Abstract

The seminal work by Edmonds [10] and Lovasz [40] shows the strong connection between submodular 
functions and convex functions. Submodular functions have tight modular lower bounds, and a subdifferential
structure [17] in a manner akin to convex functions. They also admit polynomial time algorithms for
minimization and satisfy the Fenchel duality theorem [19] and the discrete separation theorem [16], both 
of which are fundamental characteristics of convex functions. Submodular functions also have properties 
similar to concavity. For example, submodular function maximization, though NP hard, admits constant 
factor approximation guarantees. Concave functions composed with modular functions are submodular, 
and they also show the diminishing returns property. In this manuscript, we try to provide a more 
complete picture on the relationship between submodularity and both convexity and concavity — we 
do this by extending many of the results connecting submodularity with convexity [40, 16, 19, 10, 17] 
to the concave aspects of submodular functions. We first show the existence of superdifferentials (a 
polyhedral partitioning of $\mathbb{R}^V$) and efficiently computable tight modular upper bounds of a submodular
function. While we show that it is hard to characterize these polyhedra, we obtain inner and outer 
bounds on the superdifferential along with certain specific and useful supergradients. We then investigate 
forms of concave extensions of submodular functions and show interesting relationships to submodular 
maximization. We next show connections between optimality conditions over the superdifferentials and 
submodular maximization, and show how forms of approximate optimality conditions translate into 
approximation factors for maximization. We end this paper by studying versions of a “concave” discrete 
separation theorem and the Fenchel duality theorem when seen from the concave point of view. In every 
case, we relate our results to the existing results from the convex point of view, thereby improving the 
analysis of the relationship between submodularity, convexity, and concavity. 


1 Introduction 

Long known to be an important property for problems in combinatorial optimization, economics, operations 
research, and game theory, submodularity is gaining popularity in a number of new areas including machine 
learning. Along with its natural connection to many application domains, it also admits a number of 
interesting theoretical characterizations. A function $f : 2^V \to \mathbb{R}$ over a ground set $V = \{1, 2, \dots, n\}$ is
submodular if for all subsets $S, T \subseteq V$, it holds that

$$f(S) + f(T) \geq f(S \cup T) + f(S \cap T). \quad (1)$$

Equivalently, a submodular set function satisfies diminishing marginal returns. Define $f(j|S) = f(S \cup \{j\}) - f(S)$
as the marginal cost of element $j \in V$ with respect to $S \subseteq V$ (footnote 1). The diminishing returns property states
that

$$f(j|S) \geq f(j|T), \quad \forall S \subseteq T \text{ and } j \notin T. \quad (2)$$

1 We also use this notation for sets $A, B$, as in $f(A|B) = f(A \cup B) - f(B)$.
Figure 1: Convex and Concave functions with sub and super gradients respectively


Throughout the rest of the paper, we shall also assume without loss of generality that $f(\emptyset) = 0$.
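
As a concrete illustration of definitions (1) and (2), the following Python sketch checks both conditions by brute force on a small ground set. The coverage function `coverage`, the toy data `AREAS`, and all helper names are hypothetical choices made for this illustration and do not come from the text above. The enumeration is exponential in $n$ and is intended only for toy examples.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(f, ground):
    """Brute-force check of Eqn. (1): f(S) + f(T) >= f(S | T) + f(S & T)."""
    all_sets = list(subsets(ground))
    return all(
        f(S) + f(T) >= f(S | T) + f(S & T) for S in all_sets for T in all_sets
    )


def has_diminishing_returns(f, ground):
    """Brute-force check of Eqn. (2): f(j|S) >= f(j|T) for S subset of T, j not in T."""
    all_sets = list(subsets(ground))
    for T in all_sets:
        for S in all_sets:
            if not S <= T:
                continue
            for j in set(ground) - T:
                if f(S | {j}) - f(S) < f(T | {j}) - f(T):
                    return False
    return True


# Hypothetical example: element i covers the items in AREAS[i],
# and f(S) counts the items covered by at least one element of S.
AREAS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}
GROUND = frozenset(AREAS)


def coverage(S):
    """Normalized coverage function: coverage(empty set) = 0."""
    return len(set().union(*(AREAS[i] for i in S)))


print(is_submodular(coverage, GROUND), has_diminishing_returns(coverage, GROUND))
```

On this example both checks agree, as the equivalence of (1) and (2) predicts; a function such as $|S|^2$ fails the first check.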

Submodularity and convexity: Submodular functions have been strongly associated with convex functions,
to the extent that submodularity is sometimes regarded as a discrete analogue of convexity [20].
This relationship is evident from the fact that submodular function minimization is easy in that there exist
strongly polynomial time algorithms which achieve it. This is akin to convex minimization which is also 
easy. A number of recent results, however, make this relationship much more formal. For example, similar 
to convex functions, submodular functions have tight modular lower bounds and admit a subdifferential 
characterization [17]. Moreover, it is possible [19] to provide optimality conditions, in a manner analogous to 
the Karush-Kuhn-Tucker (KKT) conditions from convex programming, for submodular function minimization. 
Furthermore, the Fenchel duality theorem and the discrete separation theorem, both of which are known 
to hold for convex functions have been shown to hold also for submodular functions [19, 16]. Submodular 
functions also admit a natural convex extension, known as the Lovasz extension, that is easy to evaluate [40] 
and optimize. The Lovasz extension, moreover, also has no integrality gap and minimizing a submodular 
function is equivalent to minimizing its Lovasz extension. All these results show that submodularity is indeed 
closely related to convexity, and seems to verify the claim that submodularity is “the” discrete analog of 
convexity. 

Submodular functions and concavity: Submodular functions also have properties that are unlike 
convexity and are more akin to concavity. Submodular function maximization is known to be NP hard. 
However, there exist a number of constant factor approximation algorithms based on simple greedy or 
local search heuristics [12, 37, 45] and some recent continuous approximation methods [6, 13]. This is 
unlike convexity where maximization can be hopelessly difficult [49]. Furthermore, submodular functions 
have a diminishing returns property which is similar to concavity, and concave over modular functions are 
known to be submodular. In addition, submodular functions have been shown to have tight modular upper 
bounds [31, 27, 26, 33, 35], and as we show, possess superdifferentials and supergradients very much like concave 
functions. The multi-linear extension of a submodular function, which is useful [6] for example in the context of 
submodular maximization, is known to be concave when restricted to a particular direction. All these seem to 
indicate that submodular functions are related both to convexity and to concavity. In some sense, submodular 
functions are strange and lucky — convex and concave functions each have distinct and useful properties, 
while submodular functions have the best of both worlds. In this paper, we formalize these relationships.

1.1 Motivation and Past Work 

For more than four decades, researchers have been investigating theoretical and algorithmic aspects of 
submodular functions. The bulk of this work [20, 10, 40, 17, 16, 18] has been in relating submodular functions 
to convexity from a polyhedral perspective, thereby culminating in efficient algorithms for submodular 
minimization. From a polyhedral perspective, Fujishige, Edmonds and others [20, 10, 17], provided a 
characterization of the submodular polyhedron, the base polytope, and subdifferentials of submodular 
functions. Lovasz [40] then provided an efficient characterization of the convex extension of a submodular 
function, which has become known as the Lovasz extension and also the Choquet integral [7]. The connection 
between submodularity and convexity was made still more precise when it was shown [16, 18] that the
discrete separation theorem, Fenchel duality theorem, and the Minkowski sum theorems hold for submodular 
functions, when seen as analogous to convexity. From a computational perspective, these results have helped 
provide several algorithms for submodular function minimization. In particular, [20, 2] use the submodular 
polyhedron and the convex extension to provide an exact algorithm for submodular minimization. Similarly, 
[50, 24, 23, 46, 25] and others have used many of these ideas to provide exact algorithms for submodular 
minimization. 

While submodular functions are related to concavity (as discussed above), the polyhedral aspects of 
submodular functions from the perspective of maximization (and that we address in this paper) have not been 
nearly as well studied. Most work on submodular maximization has been on the exploration of approximation 
algorithms. The first set of results for submodular maximization were shown in [45, 44], where they provide 
a $1 - 1/e$ approximation algorithm (in the form of a simple greedy heuristic) for maximizing a monotone
submodular function under a cardinality constraint. Further variants of the greedy algorithm were also
extended to matroid and knapsack constraints [15, 52, 36, 39]. The factor $1 - 1/e$ was also shown to be
optimal under the value oracle model [11, 44]. The first systematic study of non-monotone submodular
function maximization was performed by Feige et al. [12], who obtain a 1/3 approximation and a randomized 2/5
approximation for unconstrained submodular maximization. They also show an absolute hardness of 1/2
for this problem. They left open, however, the question of whether there exists a tight 1/2 approximation
algorithm for this problem. This question was resolved in [4], where it is shown that a simple randomized
linear time algorithm achieves an approximation factor of 1/2 in expectation. Many of these results have been
extended to matroid and knapsack constraints in [37, 38].

Polyhedral aspects of submodular maximization and the concave extension of a submodular function have 
been studied but only in a relatively limited context [12, 5, 54, 9, 45, 27, 26, 33, 35, 3]. For example, a 
recent chain of work by Jan Vondrak and others [5, 54, 9] investigated concave extensions of a submodular 
function, which were shown to be NP hard to evaluate [54]. Similarly, the submodular semidifferentials
have gained a lot of attention from the machine learning community. In particular, the subgradients and
supergradients of a submodular function have inspired a unifying Majorization-Minimization framework for 
submodular optimization [43, 27, 26, 33, 35, 28, 30]. These semidifferentials have also been used in the context 
of approximate inference in a class of probability distributions defined via submodular functions [29, 8], and 
have also been used to define a class of Bregman divergences using submodular functions [27]. 

In this paper, we attempt to provide a first unifying characterization of the concave aspects of submodular 
functions from a polyhedral perspective, thereby extending many of the observations made in [40]. In this 
effort, we discover a number of interesting connections between these different aspects of submodular functions
and concavity, and contrast them with known results on submodularity and convexity.

1.2 Our Contributions 

The main contribution of this work is to provide the first systematic theoretical study related to polyhedral
aspects of submodular function maximization and connections to concavity. The following provides a summary 
of the main components and contributions of this paper. 

• We show that submodular functions have tight modular (additive) upper bounds, thereby proving
the existence of the superdifferential of a submodular function. We show that characterizing this
superdifferential is NP hard in general. However, we provide a series of (successively tighter) outer and
also inner polyhedral bounds, all obtainable in polynomial time, and also show that we can obtain
some specific practically useful supergradients in polynomial time. Along the way, we relate this to
$M^\natural$-concave submodular functions [41] defined on $2^V$.

• We also extend the notion of the submodular polyhedron (which consists of the set of modular lower 
bounds of a submodular function, and for reasons that will become clear, we will refer to as the 
“submodular lower polyhedron”). We then define the submodular upper polyhedron (which consists of 
the set of modular upper bounds of the submodular function). 
• We define the concave extension of a submodular function, in a manner similar to the convex extension,
namely as a linear program over the submodular upper polyhedron. We show that this is identical to 
the concave extensions considered in the past [5, 54]. We also provide a family of concave extensions 
based on bounds on the submodular upper polyhedra, some of which can be efficiently computed in 
polynomial time. We relate these extensions to submodular function maximization. 

• We then show how we can define forms of optimality conditions for submodular maximization through 
the submodular superdifferential. We also show how optimality conditions related to approximations 
to the superdifferential lead to a number of familiar approximation guarantees for these problems. 

• Finally we study the Fenchel duality and discrete separation theorems for submodular functions seen
in connection to concavity. While these do not hold in general, we show that they hold under certain
quite mild conditions. We also show that the Minkowski sum theorem holds under certain restricted
conditions.

• Throughout this paper, we point to interesting connections regarding how our results generalize many 
of the results of $M^\natural$-concave submodular functions [41] on $2^V$, where many of these characterizations
are exact. 

1.3 Road-Map of this paper 

In Sections 2, 3, 4 and 5, we review the connections between submodularity and convexity. Most of the results 
in these sections are from [40, 20], and in some cases we provide some generalizations. In Section 2, we review 
polyhedral aspects of submodularity and convexity, and investigate the submodular polyhedron, submodular 
subdifferentials, etc. In Section 3, we study the convex extensions of a submodular function, while in Section 4 
we review the optimality conditions of submodular function minimization from a polyhedral perspective. In 
Section 5, we review the discrete separation theorem, the Fenchel duality theorem, and the Minkowski sum
theorem, all from the perspective of the convex analogy of submodular functions. In Section 6 we define and
investigate the polyhedral aspects of submodularity and concavity; we do this by defining the submodular
upper polyhedron and the submodular superdifferentials. In Section 7, we provide a characterization of the
concave extension of a submodular function. In Section 8, we study the optimality conditions of submodular 
function maximization from a polyhedral perspective. Finally, in Section 9, we provide versions of the discrete 
separation theorem, the Fenchel duality theorem and the Minkowski sum theorem but from the perspective 
of concavity of a submodular function. 


2 Polyhedral aspects of Submodularity and Convexity 

Most of the results in this section are covered in [10, 40, 20] and the references contained therein, so for more 
details please refer to these texts. We use this section to review existing work on the polyhedral connections 
between submodularity and convexity and to help contrast these with the corresponding results on the 
polyhedral connections between submodularity and concavity starting in Section 6. 

2.1 Submodular (Lower) Polyhedron 

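For a submodular function $f$, the submodular (lower) polyhedron (footnote 2) and the base polytope of a submodular
function [20] are defined, respectively, as:

$$\mathcal{P}_f = \{x \in \mathbb{R}^V : x(S) \leq f(S), \forall S \subseteq V\} \quad \text{and} \quad \mathcal{B}_f = \mathcal{P}_f \cap \{x \in \mathbb{R}^V : x(V) = f(V)\}, \quad (3)$$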

2 Since the submodular polyhedron consists of modular lower bounds of a submodular function, we shall also call it the 
submodular lower polyhedron to contrast with the submodular upper polyhedron we introduce in Section 6.1. 
Figure 2: The submodular polyhedron $\mathcal{P}_f$ and the base polytope $\mathcal{B}_f$ in two dimensions

where $x(S) = \sum_{i \in S} x_i$ for any $S \subseteq V$. The submodular polyhedron has a number of interesting properties,
one important one being that the extreme points and facets can easily be characterized even though the
polyhedron itself is described by an exponential number of inequalities. In fact, surprisingly, every extreme
point of the submodular polyhedron is an extreme point of the base polytope. These extreme points admit
an interesting characterization in that they can be computed via a simple greedy algorithm [10]. Let
$\sigma$ be a permutation of $V = \{1, 2, \dots, n\}$. Each such permutation defines a chain with elements $S_0^\sigma = \emptyset$,
$S_i^\sigma = \{\sigma(1), \sigma(2), \dots, \sigma(i)\}$ such that $S_0^\sigma \subseteq S_1^\sigma \subseteq \dots \subseteq S_n^\sigma$. This chain defines an extreme point $h^\sigma$ of $\mathcal{P}_f$
with entries

$$h^\sigma(\sigma(i)) = f(S_i^\sigma) - f(S_{i-1}^\sigma). \quad (4)$$

Each permutation of $V$ characterizes an extreme point of $\mathcal{P}_f$ and all possible extreme points of $\mathcal{P}_f$ can be
characterized in this manner [20]. Furthermore, the problem $\max_{y \in \mathcal{P}_f} y^\top x$, which is a linear program over a
submodular polyhedron, can be solved very efficiently through the greedy algorithm [10]. The following
lemma gives the greedy algorithm for finding this.

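Lemma 2.1. [10, 40] Given a vector $w \in \mathbb{R}^n_+$, consider a permutation $\sigma_w$ such that $w[\sigma_w(1)] \geq w[\sigma_w(2)] \geq
\dots \geq w[\sigma_w(n)]$. Define $s^*(\sigma_w(i)) = f(S_i^{\sigma_w}) - f(S_{i-1}^{\sigma_w})$ for $i \in \{1, 2, \dots, n\}$. Then $\operatorname{argmax}_{s \in \mathcal{P}_f} w^\top s =
\operatorname{argmax}_{s \in \mathcal{B}_f} w^\top s \ni s^*$. Furthermore, $\max_{s \in \mathcal{P}_f} w^\top s = \sum_{i=1}^n w(\sigma_w(i)) [f(S_i^{\sigma_w}) - f(S_{i-1}^{\sigma_w})]$.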
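
The greedy procedure of Lemma 2.1 can be sketched in a few lines of Python. This is a minimal illustration, assuming a normalized submodular function given as a Python callable on frozensets and nonnegative weights given as a dictionary; the coverage function and the names `AREAS`, `coverage`, and `greedy_extreme_point` are hypothetical and introduced only for this example.

```python
# Hypothetical toy data: element i covers the items in AREAS[i].
AREAS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}


def coverage(S):
    """Submodular coverage function with coverage(empty set) = 0."""
    return len(set().union(*(AREAS[i] for i in S)))


def greedy_extreme_point(f, w):
    """Greedy algorithm of Lemma 2.1.

    `w` maps each ground element to a nonnegative weight. Returns (s, value),
    where s(sigma(i)) = f(S_i) - f(S_{i-1}) and value = sum_j w(j) * s(j).
    """
    # sigma_w: elements sorted by decreasing weight (ties broken arbitrarily).
    order = sorted(w, key=lambda j: w[j], reverse=True)
    s, chain, prev = {}, set(), f(frozenset())
    for j in order:
        chain.add(j)
        cur = f(frozenset(chain))
        s[j] = cur - prev  # marginal value along the chain
        prev = cur
    value = sum(w[j] * s[j] for j in order)
    return s, value


s_star, value = greedy_extreme_point(coverage, {1: 3, 2: 2, 3: 1})
print(s_star, value)  # {1: 2, 2: 1, 3: 0} 8
```

Only $n$ evaluations of $f$ (plus one at the empty set) and one sort are needed, which is why the linear program can be solved efficiently despite the exponential number of constraints.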

It is immediate that the optimizers $s^*$ above form extreme points of the submodular polyhedron. Also,
given a submodular function $f$ such that $f(\emptyset) = 0$ (footnote 3), the condition that $x \in \mathcal{P}_f$ can be checked in polynomial
time for every $x$; this follows directly from the fact that submodular function minimization is polynomial
time.

Proposition 2.2. Given a submodular function $f$, checking if $x \in \mathcal{P}_f$ is equivalent to the condition
$\min_{X \subseteq V} [f(X) - x(X)] \geq 0$, which can be checked in poly-time.
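
To make the membership condition of Proposition 2.2 concrete, the sketch below evaluates $\min_{X \subseteq V} [f(X) - x(X)]$ directly. Note the assumption: it enumerates all subsets instead of calling a polynomial-time submodular minimization routine, so it only illustrates the condition on small ground sets. The coverage function and all names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: element i covers the items in AREAS[i].
AREAS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}


def coverage(S):
    """Submodular coverage function with coverage(empty set) = 0."""
    return len(set().union(*(AREAS[i] for i in S)))


def in_submodular_polyhedron(f, x, tol=1e-12):
    """Check x in P_f via min_X [f(X) - x(X)] >= 0.

    Brute force over all subsets; a real implementation would use a
    submodular function minimization algorithm instead.
    """
    ground = list(x)
    gap = min(
        f(frozenset(X)) - sum(x[j] for j in X)
        for r in range(len(ground) + 1)
        for X in combinations(ground, r)
    )
    return gap >= -tol


print(in_submodular_polyhedron(coverage, {1: 2, 2: 1, 3: 0}))  # True
print(in_submodular_polyhedron(coverage, {1: 2, 2: 2, 3: 0}))  # False
```

The second vector violates the constraint at $X = \{1, 2\}$, where $x(X) = 4 > f(X) = 3$.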

2.2 The Submodular Subdifferential 

Another aspect of the connection between submodular functions and convexity is the submodular subdifferentials
[17]. The subdifferential $\partial_f(X)$ of a submodular set function $f : 2^V \to \mathbb{R}$ for a set $X \subseteq V$ is defined
[17, 20] analogously to the subdifferential of a continuous convex function:

$$\partial_f(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \geq f(X) - x(X) \text{ for all } Y \subseteq V\} \quad (5)$$

The polyhedra above can be defined for any (not necessarily submodular) set function. When the function
is submodular, however, it can be characterized efficiently. Firstly, note that for normalized submodular

3 Any set function $h$ is said to be normalized if $h(\emptyset) = 0$.


Figure 3: The subdifferentials $\partial_f(Y)$ of a submodular function for different sets $Y$ in two dimensions. Notice
that the subdifferentials partition the space $\mathbb{R}^2$. In this case, V =


functions, for any $h_X \in \partial_f(X)$, we have $f(X) - h_X(X) \leq 0$, which follows from the constraint at $Y = \emptyset$. Like
the submodular polyhedron, the extreme points of the submodular subdifferential also admit interesting
characterizations. We shall denote a subgradient at $X$ by $h_X \in \partial_f(X)$. Similar to the submodular polyhedron,
the extreme points of $\partial_f(X)$, for any $X$, may be computed via a greedy algorithm as follows: let $\sigma$ be a
permutation of $V$ that assigns the elements in $X$ to the first $|X|$ positions ($i \leq |X|$ if and only if $\sigma(i) \in X$)
and $S_{|X|}^\sigma = X$. An illustration of this is shown in Figure 4.


Figure 4: A visualization of a permutation $\sigma = (\sigma(1), \sigma(2), \dots)$ of $V$ and the chain of sets
according to this permutation, where $S_i^\sigma = \{\sigma(1), \sigma(2), \dots, \sigma(i)\}$. Here, also, we show the permutation is
compatible with $X = S_{|X|}^\sigma$ with $|X| = 4$.

This chain defines an extreme point $h_X^\sigma$ of $\partial_f(X)$ with entries

$$h_X^\sigma(\sigma(i)) = f(S_i^\sigma) - f(S_{i-1}^\sigma). \quad (6)$$

Note that for every subgradient $h_X \in \partial_f(X)$ we can define a modular function

$$m_X(Y) \triangleq f(X) + h_X(Y) - h_X(X) \quad (7)$$

that is defined $\forall Y \subseteq V$, and that is a tight lower bound of $f$; that is, $m_X$ satisfies $m_X(Y) \leq f(Y), \forall Y \subseteq V$
and we have that $m_X(X) = f(X)$. Hence, the subdifferential corresponds exactly to the set of tight modular
lower bounds of a submodular function, at a given set $X$. If we choose $h_X$ to be an extreme subgradient, the
modular lower bound becomes $m_X(Y) = h_X(Y)$, resulting in a normalized modular function (i.e., $m_X(\emptyset) = 0$).
Also, if $X = S_j^\sigma$ for some $j$, then since $h_X^\sigma(S_i^\sigma) = f(S_i^\sigma)$ for all $i$, the modular lower bound defined as
$m_X(Y) = h_X^\sigma(Y)$ has the property that it is tight for all sets $\{S_i^\sigma\}_i$, not just $X$.
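
The construction of an extreme subgradient and the associated modular lower bound can be checked numerically. The following Python sketch is an illustration under stated assumptions: the coverage function, the choice $X = \{2, 3\}$, the permutation `[2, 3, 1]`, and all function names are hypothetical and chosen only to make the example small.

```python
from itertools import combinations

# Hypothetical toy data: element i covers the items in AREAS[i].
AREAS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}


def coverage(S):
    """Submodular coverage function with coverage(empty set) = 0."""
    return len(set().union(*(AREAS[i] for i in S)))


def extreme_subgradient(f, order):
    """Entries f(S_i) - f(S_{i-1}) along the chain defined by `order`.

    If the first |X| entries of `order` are exactly the elements of X,
    the result is an extreme point of the subdifferential at X.
    """
    h, chain, prev = {}, set(), f(frozenset())
    for j in order:
        chain.add(j)
        cur = f(frozenset(chain))
        h[j] = cur - prev
        prev = cur
    return h


def modular_lower_bound(f, h, X):
    """Return the function m_X(Y) = f(X) + h(Y) - h(X)."""
    offset = f(frozenset(X)) - sum(h[j] for j in X)
    return lambda Y: offset + sum(h[j] for j in Y)


X = frozenset({2, 3})
h = extreme_subgradient(coverage, [2, 3, 1])  # X occupies the first two positions
m = modular_lower_bound(coverage, h, X)

all_sets = [frozenset(c) for r in range(4) for c in combinations([1, 2, 3], r)]
print(h)                                        # {2: 2, 3: 0, 1: 1}
print(all(m(Y) <= coverage(Y) for Y in all_sets), m(X) == coverage(X))
```

In this example the bound is also tight at the other chain sets $\{2\}$ and $\{1, 2, 3\}$, in line with the remark above.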

The subdifferential defined in Eqn. (5) is defined via an exponential number of inequalities. A key
observation however is that many of these inequalities are redundant. We define three polyhedra:

$$\partial_f^1(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \geq f(X) - x(X), \forall Y \subseteq X\} \quad (8)$$

$$\partial_f^2(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \geq f(X) - x(X), \forall Y \supseteq X\} \quad (9)$$

$$\partial_f^3(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \geq f(X) - x(X), \forall Y : Y \not\subseteq X, Y \not\supseteq X\} \quad (10)$$

We immediately have that $\partial_f(X) = \partial_f^1(X) \cap \partial_f^2(X) \cap \partial_f^3(X)$. The following lemma shows that the inequalities
in $\partial_f^3(X)$ are redundant in characterizing $\partial_f(X)$ when given $\partial_f^1(X)$ and $\partial_f^2(X)$.

Lemma 2.3. ([20, Lemma 6.4]) Given a submodular function $f$, $\partial_f(X) = \partial_f^1(X) \cap \partial_f^2(X)$. Hence,

$$\partial_f(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \geq f(X) - x(X), \forall Y \in [\emptyset, X] \cup [X, V]\} \quad (11)$$

In the above, $[A, B] = \{X \subseteq V : A \subseteq X \subseteq B\}$ whenever $A \subseteq B$. We thus see that for $X \notin \{\emptyset, V\}$, many
of the inequalities defining $\partial_f(X)$ in Eqn. (5) are in fact redundant.

The subdifferential at the empty set has a special relationship since $\partial_f(\emptyset) = \mathcal{P}_f$. Similarly $\partial_f(V) = \mathcal{P}_{f^\#}$,
where $f^\#(X) = f(V) - f(V \setminus X)$ is the submodular dual of $f$. Furthermore, since $f^\#$ is a supermodular
function, it holds that $\partial_f(V)$ is a supermodular polyhedron (for a supermodular function $g$, the supermodular
polyhedron is defined as $\mathcal{P}_g = \{x \in \mathbb{R}^n : x(X) \geq g(X), \forall X \subseteq V\}$).

The following lemma shows another instructive fact about the subdifferentials: 

Lemma 2.4. ([20, Lemma 6.5]) For any submodular function $f$, $\partial_f(X) = \partial_{f^X}(X) \times \partial_{f_X}(\emptyset)$, where $f^X(Y) =
f(Y), \forall Y \subseteq X$, and $f_X(Y) = f(Y \cup X) - f(X), \forall Y \subseteq V \setminus X$, and $\times$ denotes the direct product.

Finally we define what we call the local approximation of the subdifferential as follows: 

$$\partial_f^{(1,1)}(X) = \{x \in \mathbb{R}^n : \forall j \in X, f(j | X \setminus j) \leq x(j) \text{ and } \forall j \notin X, f(j | X) \geq x(j)\}. \quad (12)$$

Notice that $\partial_f^{(1,1)}(X) \supseteq \partial_f(X)$ since we have fewer constraints here than in the original subdifferential. In
particular $\partial_f^{(1,1)}(X)$ considers only $n$ inequalities by choosing the sets $Y$ in Eqn. (11) such that $|Y \triangle X| = 1$
(i.e., Hamming distance one away from $X$). This polyhedron will be useful in characterizing local minimizers
of a submodular function (see Section 4) and motivating analogous constructs for local maxima (see, for
example, Proposition 8.2).
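
Because the local approximation in Eqn. (12) is just a box with one bound per coordinate, it is easy to compute explicitly. The Python sketch below does so for a hypothetical coverage function; the function names and data are assumptions made for this illustration only.

```python
import math

# Hypothetical toy data: element i covers the items in AREAS[i].
AREAS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}
GROUND = frozenset(AREAS)


def coverage(S):
    """Submodular coverage function with coverage(empty set) = 0."""
    return len(set().union(*(AREAS[i] for i in S)))


def local_subdifferential_bounds(f, ground, X):
    """Return {j: (lo, hi)} for the n inequalities of Eqn. (12).

    For j in X:     x(j) >= f(X) - f(X minus j), so the interval is (gain, +inf).
    For j not in X: x(j) <= f(X plus j) - f(X), so the interval is (-inf, gain).
    """
    X = frozenset(X)
    bounds = {}
    for j in ground:
        if j in X:
            bounds[j] = (f(X) - f(X - {j}), math.inf)
        else:
            bounds[j] = (-math.inf, f(X | {j}) - f(X))
    return bounds


def in_local_subdifferential(f, ground, X, x):
    """Check whether the vector x (a dict) satisfies all n inequalities."""
    bounds = local_subdifferential_bounds(f, ground, X)
    return all(lo <= x[j] <= hi for j, (lo, hi) in bounds.items())


bounds = local_subdifferential_bounds(coverage, GROUND, {2, 3})
zero = {1: 0, 2: 0, 3: 0}
print(in_local_subdifferential(coverage, GROUND, {2, 3}, zero))       # False
print(in_local_subdifferential(coverage, GROUND, frozenset(), zero))  # True
```

The zero vector lies in the local approximation at a set exactly when no single addition or removal decreases $f$, which is the sense in which this polyhedron describes local minimizers.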

2.3 Generalized Submodular Lower Polyhedron 

In this section, we define a generalization of the submodular polyhedron, which we call the generalized 
submodular lower polyhedron. While this construct has not been defined explicitly before, we investigate it 
primarily with the aim of contrasting this with results on the concave polyhedral aspects of a submodular 
function that we explore in Section 6. 

Define the generalized submodular lower polyhedron as follows: 

$$\mathcal{P}_f^{\text{gen}} = \{(x, c) : x \in \mathbb{R}^n, c \in \mathbb{R}, [x(X) + c] \leq f(X), \forall X \subseteq V\}. \quad (13)$$

This generalized polyhedron $\mathcal{P}_f^{\text{gen}} \subseteq \mathbb{R}^{n+1}$ intuitively captures the affine (or unnormalized) modular lower
bounds of $f$. The definition above holds for any arbitrary set function, not necessarily submodular, in
which case we call it the generalized lower polyhedron. In the case of submodular functions, this generalized
lower polyhedron has interesting connections to the submodular polyhedron. In particular, note that
$\mathcal{P}_f^{\text{gen}} \cap \{(x, c) : c = 0\} = \{(x, c) : x \in \mathcal{P}_f, c = 0\}$. In other words, the slice $c = 0$ of the generalized submodular
polyhedron is the submodular polyhedron of $f$. Also notice that for a normalized submodular function $f$, the
constraint at $X = \emptyset$ requires that $c \leq 0$.

The generalized polyhedron has interesting connections with the subdifferential - the following is a 
characterization of its facial structure. 

Lemma 2.5. Given a set function $f$, a given point $(x, c) \in \mathcal{P}_f^{\text{gen}}$ lies on a face of the polyhedron $\mathcal{P}_f^{\text{gen}}$ if and
only if there exists a set $X$ such that $x \in \partial_f(X)$ and $c = f(X) - x(X)$.

Proof. Notice that $(x, c)$ lies on a face of $\mathcal{P}_f^{\text{gen}}$ if and only if there exists a set $X$ such that $x(X) + c = f(X)$
and for all $Y \subseteq V$, $x(Y) + c \leq f(Y)$. Since then $x(Y) - x(X) \leq f(Y) - f(X)$, we have that $x \in \partial_f(X)$ and
$c = f(X) - x(X)$ that, as mentioned above, has $c \leq 0$ when $f$ is submodular. □
Figure 5: The generalized submodular lower polyhedron for a two-dimensional submodular function
$f : 2^{\{1,2\}} \to \mathbb{R}$, satisfying $f(\emptyset) = 0$, $f(\{1\}) = 1$, $f(\{2\}) = 2$, $f(\{1,2\}) = 2.5$.


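The extreme points of $\mathcal{P}_f^{\text{gen}}$ are also easy to characterize when $f$ is submodular. Surprisingly, all the
extreme points lie exactly on the hyperplane $c = 0$ with $x$ being the extreme points of $\mathcal{P}_f$.

Lemma 2.6. Given a submodular function $f$, $(x, c)$ is an extreme point of $\mathcal{P}_f^{\text{gen}}$ if and only if $x$ is an extreme
point of $\mathcal{P}_f$ and $c = 0$. Furthermore, for any $y \in \mathbb{R}^n$,

$$\underbrace{\max_{(x,c) \in \mathcal{P}_f^{\text{gen}}} [\langle x, y \rangle + c]}_{(i)} = \underbrace{\max \Big\{ \max_{x \in \partial_f(X)} [\langle x, y \rangle + f(X) - x(X)] \;\Big|\; X \subseteq V \Big\}}_{(ii)} = \underbrace{\max_{x \in \mathcal{P}_f} \langle x, y \rangle}_{(iii)} \quad (14)$$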

Proof. First we show that (i) = (ii). Notice that a maximum in (i) (denoted by $(x^*, c^*)$) occurs at a face of
$\mathcal{P}_f^{\text{gen}}$ which, by Lemma 2.5, implies that there exists an $X$ such that $x^* \in \partial_f(X)$ with $c^* = f(X) - x^*(X)$.
This subdifferential is considered in the outer max of (ii), implying (ii) $\geq$ (i). Moreover, from the definitions of the
generalized submodular lower polyhedron and the subdifferential, we see that for any $X \subseteq V$
and for any $x \in \partial_f(X)$, the point $(x, f(X) - x(X)) \in \mathcal{P}_f^{\text{gen}}$. Hence (ii) $\leq$ (i), since the max in (i) is over a
larger set.

We then show that (i) = (iii). It is immediate that (iii) $\leq$ (i) since (iii) is a more constrained case of (i)
under $c = 0$. Next, we show (iii) $\geq$ (i), which states that for a submodular function, the linear program over
the generalized submodular polyhedron is equivalent to a linear program over the submodular polyhedron.
This result follows as a corollary from Lemma 2.1. Specifically, for any $(x, c) \in \mathcal{P}_f^{\text{gen}}$, we have that

$$\max_{s \in \mathcal{P}_f} w^\top s = \sum_i \lambda_i f(S_i^{\sigma_w}) \geq \sum_i \lambda_i [\langle x, 1_{S_i^{\sigma_w}} \rangle + c] \geq \langle x, w \rangle + c, \quad (15)$$

where the last inequality follows from the facts that $\sum_i \lambda_i 1_{S_i^{\sigma_w}} = w$ and $\sum_i \lambda_i = 1$. In particular, this also
means that in the optimization problem in (ii), the maximum over $X \subseteq V$ occurs at $X = \emptyset$, where $\partial_f(X) = \mathcal{P}_f$.

Lastly, note that since every linear program over the generalized submodular polyhedron can be cast as
a linear program over the submodular polyhedron, the extreme points of both polyhedra must also be the
same. □







Intuitively, $(x, c)$ is an extreme point if $x$ is an extreme point of a subdifferential $\partial_f(X)$ for some set $X$.
Since the extreme points of the subdifferentials are exactly the extreme points of the submodular polyhedron,
the result follows.

Finally, it is worth mentioning that similar to the submodular polyhedron, the generalized submodular
polyhedron membership problem (i.e., does $(x, c) \in \mathcal{P}_f^{\text{gen}}$ hold?) is polynomial time, and can be solved via submodular
minimization. This is again similar to the case for the submodular (lower) polyhedron.

Proposition 2.7. Given a submodular function $f$, $(x, c) \in \mathcal{P}_f^{\text{gen}}$ if and only if $c \leq \min_{X \subseteq V} [f(X) - x(X)]$.
Since submodular minimization is polynomial time, the generalized submodular polyhedral membership problem
is also polynomial time.
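
The membership test of Proposition 2.7 can be tried on the two-element function used in Figure 5. As a caveat, the sketch below enumerates all subsets instead of running a polynomial-time submodular minimization algorithm, and the names `FIG5`, `f`, and `in_generalized_polyhedron` are hypothetical.

```python
from itertools import combinations

# The two-element function from Figure 5, stored as a lookup table.
FIG5 = {
    frozenset(): 0.0,
    frozenset({1}): 1.0,
    frozenset({2}): 2.0,
    frozenset({1, 2}): 2.5,
}


def f(S):
    return FIG5[frozenset(S)]


def in_generalized_polyhedron(f, x, c, tol=1e-12):
    """Check c <= min_X [f(X) - x(X)] by brute force over all subsets X."""
    ground = list(x)
    min_gap = min(
        f(frozenset(X)) - sum(x[j] for j in X)
        for r in range(len(ground) + 1)
        for X in combinations(ground, r)
    )
    return c <= min_gap + tol


x = {1: 1.0, 2: 1.5}  # an extreme point of the submodular polyhedron of f
print(in_generalized_polyhedron(f, x, 0.0))   # True: (x, 0) is in the polyhedron
print(in_generalized_polyhedron(f, x, 0.1))   # False: c > 0 violates X = empty set
```

A vector outside the submodular polyhedron, such as $x = (2, 2)$, can still belong to the generalized polyhedron once $c$ is lowered enough (here $c \leq -1.5$).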

A visualization of the generalized submodular lower polyhedron for a submodular function on $V = \{v_1, v_2\}$
is shown in Figure 5.

3 Convex extensions of a Submodular Function 

We now describe the convex extension of a submodular function. We shall see a number of equivalent ways
to characterize this extension and observe how they can be computed very efficiently as what is known as 
the Lovasz extension [10, 40]. The results of this section are mainly taken from [10, 40, 9, 54] and are given 
here both for completeness and also to help contrast with the results we will show for the various concave 
extensions given in Section 7. 

Following [54, 9], we consider two main characterizations of the convex extensions, which we call the
polyhedral characterization and the distributional characterization. The main purpose of this section is to review
existing work, thereby making it easy to contrast these results with the new ones on the concave extensions
of submodular functions that we shall present in Section 7.

3.1 Polyhedral characterization of the convex extensions 

The convex extension of any set function (not necessarily submodular) can be seen as the pointwise supremum
of convex functions which lower bound the set function [9, 54, 3]. Precisely, let

$$\Phi_f = \{\phi : \phi \text{ is convex in } [0,1]^n \text{ and } \phi(1_X) \leq f(X), \forall X \subseteq V\} \quad (16)$$

be the set of continuous convex functions on $[0,1]^n$ that lower bound $f(\cdot)$, where $1_X \in \{0,1\}^n$ denotes the
indicator vector of $X$. Then define the convex extension $\breve{f} : [0,1]^{|V|} \to \mathbb{R}$ as follows:

$$\breve{f}(w) = \max_{\phi \in \Phi_f} \phi(w), \quad \text{for } w \in [0,1]^n. \quad (17)$$

It is not hard to show that $\breve{f}$ is convex and satisfies the relation $\breve{f}(1_X) = f(X)$. The above expression can in
fact be simplified for any set function, and it suffices to consider affine lower bounds instead of convex lower bounds.
In particular Eqn. (17) can be expressed as a linear program over the generalized polyhedron.

Lemma 3.1. Given a set function $f$, the convex extension of $f$ in Eqn. (17) can be expressed as:

$$\breve{f}(w) = \max_{(x,c) \in \mathcal{P}_f^{\text{gen}}} [\langle x, w \rangle + c], \quad \forall w \in [0,1]^n \quad (18)$$

Proof. The proof of the equivalence follows from a simple observation. For a given $w$, let $\phi$ be an argmax in
Eqn. (17). Then since $\phi$ is a convex function in $[0,1]^n$, there exists a subgradient $x \in \mathbb{R}^n$ at $w$ and value $d$,
such that $\langle x, y \rangle + d \leq \phi(y), \forall y$ and $\langle x, w \rangle + d = \phi(w)$. In other words, $\langle x, y \rangle + d$, seen as a function of $y$, is a
linear lower bound of $\phi(y)$ that is tight at $w$. Hence, at value $w$, $\breve{f}(w)$ takes value $\langle x, w \rangle + d$. Finally
notice that $(x, d) \in \mathcal{P}_f^{\text{gen}}$ since $\langle x, 1_X \rangle + d = x(X) + d \leq \phi(1_X) \leq f(X), \forall X \subseteq V$. □
In the above, we have so far not yet invoked the submodularity of $f$, something that can lead to great
simplifications. If $f$ is submodular, then the above polyhedral characterization can be replaced by a linear
program over the submodular polyhedron. In other words,

Lemma 3.2. For a submodular function $f$, the expressions in Eqn. (17), (18) can be rewritten as:

$$\breve{f}(w) = \max_{x \in \mathcal{P}_f} \langle x, w \rangle, \quad \forall w \in [0,1]^n. \quad (19)$$

Proof. This follows directly from Lemma 2.6. □

Hence, when $f$ is submodular, we may assume $c = 0$ in Eqn. (18). The above result is not surprising given
that the extreme points of $\mathcal{P}_f^{\text{gen}}$ are identical to the extreme points of $\mathcal{P}_f$ when $f$ is submodular.

3.2 Distributional characterization of the convex extension 

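Another way to characterize the continuous extension of a set function $f$ is as follows. For a given $w \in [0,1]^n$,
denote $\Lambda_w$ as the set:

$$\Lambda_w = \Big\{ \{\lambda_S, S \subseteq V\} : \sum_{S \subseteq V} \lambda_S 1_S = w, \ \sum_{S \subseteq V} \lambda_S = 1, \text{ and } \forall S, \lambda_S \geq 0 \Big\}. \quad (20)$$

Then the convex extension $\breve{f}$ can be equivalently written as:

$$\breve{f}(w) = \min_{\lambda \in \Lambda_w} \sum_{S \subseteq V} \lambda_S f(S) \quad (21)$$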

The reason this representation is called distributional is that the convex extension here is computed by 
minimizing over particular distributions over sets. Again, it is not hard to see that this characterization is a 
convex extension. 

For a submodular function, the distribution characterization takes on a nice form, which is known 
classically as the Lovasz extension. This result can be found, for example, in [9, 54]: 

Lemma 3.3. [9, 40, 10] Given a submodular function f, 

n n— 1 

K w ) = ^2,w{a w {i)){f{Sf w ) - /(Sf”i) = w(a w (n)))f(S^) + ^(w(cr u ,(*)) - w(a w (i + l)))/(Sf“), (22) 

i=1 i=l 

where a w is a permutation satisfying w(a w ( 1)) > w(a w (2) > ■ ■ ■ > w(a w (n)). 

It is clear from the above that the minimizing distribution $\lambda$ is a form of chain distribution, where the chain
here is the sequence of sets $S_0^{\sigma_w}, S_1^{\sigma_w}, \ldots, S_n^{\sigma_w}$ defined in Lemma 2.1. We also see the relationship between
the two characterizations in the case of submodular functions, since Eqn. (22) is exactly the solution of the
linear program over the submodular polyhedron (see Lemma 2.1). Hence the two forms of convex extensions,
i.e., the distributional characterization from Lemma 3.3 and the polyhedral characterization from Lemma 3.2, are
identical for a submodular function. The resulting convex function $\breve{f}$ is the Lovasz extension.
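The two forms of Eqn. (22) can be evaluated directly by sorting the coordinates of $w$. The following is a minimal sketch, not the authors' implementation: the function names, the 0-based element indices, and the demo function $f(S) = \min(|S|, 2)$ are illustrative assumptions, and $f(\emptyset) = 0$ is assumed throughout.

```python
from fractions import Fraction


def chain_order(w):
    """Indices sorted so that w is non-increasing (a valid permutation sigma_w)."""
    return sorted(range(len(w)), key=lambda j: w[j], reverse=True)


def lovasz_gain_form(f, w):
    """First form of Eqn. (22): w(sigma(i)) times the marginal gain along the chain."""
    total, chain = 0, frozenset()
    for j in chain_order(w):
        total += w[j] * (f(chain | {j}) - f(chain))
        chain = chain | {j}
    return total


def lovasz_chain_form(f, w):
    """Second form of Eqn. (22): differences of sorted weights times f on chain sets.

    Assumes f(empty set) = 0, as in the paper.
    """
    order = chain_order(w)
    chains, current = [], frozenset()
    for j in order:
        current = current | {j}
        chains.append(current)
    total = w[order[-1]] * f(chains[-1])
    for i in range(len(w) - 1):
        total += (w[order[i]] - w[order[i + 1]]) * f(chains[i])
    return total


def f_demo(S):
    """Hypothetical submodular function: a concave function of the cardinality."""
    return min(len(S), 2)


w_demo = [Fraction(3, 4), Fraction(1, 4), Fraction(1, 2)]
value = lovasz_gain_form(f_demo, w_demo)
```

On 0/1 vectors both forms return $f$ of the indicated set, which is what makes $\breve{f}$ an extension of $f$.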

The equivalence between the two characterizations holds for general set functions, not necessarily submodular.
In other words, Eqn. (21) and Eqns. (17), (18) are identical for any set function. This follows directly
from the arguments in [9, 54]. The only catch, however, is that Lemmas 3.2 and 3.3 do not hold for general
set functions and $\breve{f}$ can be NP hard to evaluate in general [9, 54, 3].

3.3 Convex Extensions and Submodular Minimization

The Lovasz extension plays an important role in submodular minimization. In particular, minimizing the 
Lovasz extension is equivalent to minimizing a submodular function: 

Lemma 3.4. ([40]) Given a submodular function $f$,

$$\min_{X \subseteq V} f(X) = \min_{x \in [0,1]^n} \breve{f}(x). \tag{23}$$

Furthermore, given the minimizer $x^*$ of the RHS above, we can obtain a set $X^*$ such that $f(X^*) = \breve{f}(x^*)$.

This implies that unconstrained submodular minimization has an integrality gap of one and the two 
problems are equivalent. 
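One standard way to recover a set from a fractional point is to look at the chain of level sets of the point: since $\breve{f}(w)$ is a convex combination of the values of $f$ on that chain, the best chain set is never worse than $\breve{f}(w)$. The sketch below illustrates this under stated assumptions (the helper names and the demo function are hypothetical, and $f(\emptyset) = 0$ is assumed); it is not claimed to be the procedure used in [40].

```python
from fractions import Fraction
from itertools import product


def chain_sets(w):
    """Chain of level sets of w: empty set, top-1 coordinates, top-2, and so on."""
    order = sorted(range(len(w)), key=lambda j: w[j], reverse=True)
    sets = [frozenset()]
    for j in order:
        sets.append(sets[-1] | {j})
    return sets


def lovasz(f, w):
    """Lovasz extension via the marginal-gain form of Eqn. (22)."""
    order = sorted(range(len(w)), key=lambda j: w[j], reverse=True)
    total, chain = 0, frozenset()
    for j in order:
        total += w[j] * (f(chain | {j}) - f(chain))
        chain = chain | {j}
    return total


def round_by_threshold(f, w):
    """Return the chain set with the smallest value of f."""
    return min(chain_sets(w), key=f)


def f_demo(S):
    """Hypothetical submodular function: concave in |S| plus a modular term."""
    return min(len(S), 2) - Fraction(3, 4) * len(S)


grid = [Fraction(k, 4) for k in range(5)]  # 0, 1/4, ..., 1
points = list(product(grid, repeat=3))
rounding_never_worse = all(
    f_demo(round_by_threshold(f_demo, w)) <= lovasz(f_demo, w) for w in points
)
best_fractional = min(lovasz(f_demo, w) for w in points)
```

On this small grid the smallest value of the extension coincides with the smallest value of $f$ over sets, as Eqn. (23) predicts.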


4 Optimality conditions for submodular minimization 

Fujishige [19] provides some interesting characterizations of optimality conditions for unconstrained submodular 
minimization. The following theorem can be thought of as a discrete analog to the KKT conditions: 

Lemma 4.1. ([20, Lemma 7.1]) A set $A \subseteq V$ is a minimizer of $f : 2^V \to \mathbb{R}$ if and only if:

$$0 \in \partial_f(A). \tag{24}$$


This immediately provides necessary and sufficient conditions for optimality:

Lemma 4.2. ([20, Theorem 7.2]) A set $A$ minimizes a submodular function $f$ if and only if $f(A) \le f(B)$
for all sets $B$ such that $B \subseteq A$ or $A \subseteq B$.


In other words, it is sufficient to check only the subsets and supersets of $A$ to ensure that $A$ is a global
minimizer of $f$. The above Lemma follows from Eqn. (11) and Lemma 4.1. Analogous characterizations have
also been provided for constrained forms of submodular minimization, and interested readers may look at [19].
Finally, we can provide a simple characterization of the local minimizers of a submodular function.


Lemma 4.3. A set $A \subseteq V$ is a local minimizer$^4$ of a submodular function $f$ if and only if $0 \in \partial_f^{(1,1)}(A)$.


As was shown in [31], a local minimizer of a submodular function, in the unconstrained setting, can be
found efficiently in $O(n^2)$ complexity.
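The difference between the local condition of footnote 4 and the subset/superset condition of Lemma 4.2 can be seen on a small instance. The sketch below is a brute-force illustration only: the helper names and the demo function (a concave function of the cardinality) are assumptions made for this example.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_local_min(f, A, V):
    """f(A) <= f(X) for every X at Hamming distance one from A."""
    return all(f(A ^ {j}) >= f(A) for j in V)


def satisfies_lemma_4_2(f, A, V):
    """f(A) <= f(B) for every subset and every superset B of A."""
    return all(f(B) >= f(A) for B in subsets(V) if B <= A or A <= B)


def is_global_min(f, A, V):
    return all(f(B) >= f(A) for B in subsets(V))


V = frozenset({0, 1, 2})


def f_demo(S):
    """Hypothetical submodular function: concave values 0, 1, 1, -1 by cardinality."""
    return [0, 1, 1, -1][len(S)]


empty = frozenset()
lemma_matches_bruteforce = all(
    satisfies_lemma_4_2(f_demo, A, V) == is_global_min(f_demo, A, V)
    for A in subsets(V)
)
```

Here the empty set is a local minimizer but not a global one, while the condition of Lemma 4.2 agrees with brute-force global minimality on every set.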

While unconstrained submodular minimization is easy, most forms of constrained submodular minimization 
become NP hard. For example, a simple cardinality lower bound constraint makes the problem of submodular 
minimization (even with monotone submodular functions) NP hard without even constant factor approximation 
guarantees [53]. These results, however, can be extended when the constraints are lattice constraints [20] in 
which case many of the results above still hold. 


5 Convex Characterizations: Discrete Separation Theorem and Fenchel 
Duality Theorem 

We next review some theorems that characterize convex functions and that also hold
for submodular functions.

4 A set $A$ is a local minimizer of a submodular function $f$ if $f(X) \ge f(A)$ for all $X$ with $|X \setminus A| + |A \setminus X| \le 1$, that is, for all sets $X$
no more than Hamming distance one away from $A$.

5.1 The Discrete Separation Theorem (DST)

The separation theorem [47], known in the context of convexity, states that given a convex function $\phi$ and a
concave function $\psi$ such that $\forall x, \phi(x) \ge \psi(x)$, there exists an affine function $\langle h, x \rangle + c$ such that $\forall x, \phi(x) \ge
\langle h, x \rangle + c \ge \psi(x)$.

A similar relation holds for submodular functions. The lemma below was shown by Frank [16] and has 
become known as the discrete separation theorem (DST): 

Lemma 5.1. ([16], [20, Theorem 4.12]) Given a submodular function $f$ and a supermodular function $g$ such
that $f(X) \ge g(X), \forall X$ (and which satisfy $f(\emptyset) = g(\emptyset) = 0$), there exists a modular function $h$ such that
$f(X) \ge h(X) \ge g(X)$. Furthermore, if $f$ and $g$ are integral so may be $h$.

This Lemma can also be shown using the Lovasz extension. In particular, given a submodular function
$f$ and a supermodular function $g$ such that $f(X) \ge g(X), \forall X$, we can construct the convex and concave
extensions $\breve{f}$ and $\hat{g}$ of $f$ and $g$ (the concave extension $\hat{g}$ can be constructed via the Lovasz extension of $-g$).
From the expressions of $\breve{f}$ and $\hat{g}$, it is not hard to see that $\breve{f}(x) \ge \hat{g}(x), \forall x$. Hence using the separation
theorem from convex analysis, we can find a linear function $\langle h, x \rangle$ which, when restricted to 0/1 vectors, gives the
modular function $h$.

The DST is one of the results that shows how submodular functions are analogous to convex functions. 
Surprisingly, we will show in Section 9.1 that a form of opposite (and slightly restricted) DST also holds for 
submodular functions that relates submodularity to concave functions. 

5.2 Fenchel Duality Theorem (FDT) 

The Fenchel duality theorem in the context of convexity [47] provides a relation between the minimizers of
a function and its dual. Given a convex function $\phi$ and a concave function $\psi$, the Fenchel duals $\phi^*$ of $\phi$
and $\psi^*$ of $\psi$ are given as follows:

$$\phi^*(y) = \max_{x \in \text{dom}(\phi)} [\langle x, y \rangle - \phi(x)] \quad \text{and} \quad \psi^*(y) = \min_{x \in \text{dom}(\psi)} [\langle x, y \rangle - \psi(x)]. \tag{25}$$

The dual functions $\phi^*$ and $\psi^*$ are convex and concave respectively. The Fenchel duality theorem then states
that:

$$\min_{x} [\phi(x) - \psi(x)] = \max_{y} [\psi^*(y) - \phi^*(y)]. \tag{26}$$

Analogous characterizations also hold for submodular functions [19]. Given a submodular function $f$ (or
equivalently a supermodular function $g$), the Fenchel duals $f^*$ of $f$ and $g^*$ of $g$ are defined as follows:

$$f^*(x) = \max_{X \subseteq V} [x(X) - f(X)] \quad \text{and} \quad g^*(x) = \min_{X \subseteq V} [x(X) - g(X)]. \tag{27}$$

The Fenchel duals f* and g* are convex and concave functions respectively. Then the following Lemma for 
submodular functions, analogous to the case for convex and concave functions, holds: 

Lemma 5.2. ([20, Theorem 6.3]) Given a submodular function $f$ and a supermodular function $g$,

$$\min_{X \subseteq V} [f(X) - g(X)] = \max_{x} [g^*(x) - f^*(x)]. \tag{28}$$

Further, if $f$ and $g$ are integral, the maximum on the right hand side is attained by an integral vector $x$.
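Eqn. (28) can be checked numerically on a tiny instance. The sketch below is only an illustration: the two-element functions `f_vals` and `g_vals` are made-up integral examples (submodular and supermodular respectively, both zero on the empty set), and the dual maximization is done by exhaustive search over a small integer box, which suffices here because weak duality bounds the dual value from above.

```python
from itertools import product

# Hypothetical integral functions on V = {0, 1}.
f_vals = {frozenset(): 0, frozenset({0}): 2, frozenset({1}): 3, frozenset({0, 1}): 4}
g_vals = {frozenset(): 0, frozenset({0}): 1, frozenset({1}): 1, frozenset({0, 1}): 5}


def f_star(x):
    """Fenchel dual of f, Eqn. (27): max over sets of x(X) - f(X)."""
    return max(sum(x[j] for j in S) - f_vals[S] for S in f_vals)


def g_star(x):
    """Fenchel dual of g, Eqn. (27): min over sets of x(X) - g(X)."""
    return min(sum(x[j] for j in S) - g_vals[S] for S in g_vals)


# Left-hand side of Eqn. (28): minimize f - g over all subsets.
primal = min(f_vals[S] - g_vals[S] for S in f_vals)

# Right-hand side of Eqn. (28), restricted to integral x in a small box.
box = range(-6, 7)
dual, dual_argmax = max(
    (g_star(x) - f_star(x), x) for x in product(box, repeat=2)
)
```

The search finds an integral vector attaining the primal value, in line with the integrality statement of Lemma 5.2.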

5.3 The Minkowski sum theorem 

Submodular polyhedra and also the subdifferentials have an interesting characterization related to Minkowski
sums of polyhedra, namely $P + Q = \{x + y : x \in P \text{ and } y \in Q\}$ for polyhedra $P$ and $Q$.

Figure 6: The submodular upper polyhedron $\mathcal{P}^f$ in two dimensions.

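Lemma 5.3. ([20, Theorem 6.8]) Given two submodular functions $f_1$ and $f_2$, it holds that the addition of
the polyhedra corresponds to a point-wise addition of the functions. That is:

$$\mathcal{P}_{f_1 + f_2} = \mathcal{P}_{f_1} + \mathcal{P}_{f_2}, \text{ and more generally, } \partial_{f_1 + f_2}(X) = \partial_{f_1}(X) + \partial_{f_2}(X). \tag{29}$$

Similarly it holds that $\mathcal{P}^{\text{gen}}_{f_1 + f_2} = \mathcal{P}^{\text{gen}}_{f_1} + \mathcal{P}^{\text{gen}}_{f_2}$.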

The Minkowski sum theorem for the generalized submodular polyhedron follows directly from the definition. 


6 Concave Polyhedral Aspects of Submodular Functions 

We next investigate several polyhedral aspects of submodular functions relating them to concavity, thus 
complementing the results from Section 2. This provides a complete picture on the relationship between 
submodularity, convexity, and concavity. We define and investigate the submodular upper polyhedron, 
submodular superdifferential, and the generalized submodular upper polyhedron. 

6.1 The submodular upper polyhedron 

A first step in characterizing the concave aspects of a submodular function is the submodular upper polyhedron.
Intuitively this is the set of tight modular upper bounds of the function, and we define it as follows:

$$\mathcal{P}^f = \{x \in \mathbb{R}^n : x(S) \ge f(S), \forall S \subseteq V\}. \tag{30}$$

The above polyhedron can in fact be defined for any set function. In particular, when $f$ is supermodular, we
get what is known as the supermodular polyhedron [20]. Presently, we are interested in the case when $f$ is
submodular and hence we call this the submodular upper polyhedron, a construct that is quite different from
the supermodular polyhedron.

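Interestingly, the submodular upper polyhedron has a very simple characterization due to the submodularity
of $f$. We have the following:

Lemma 6.1. Given a submodular function $f$,

$$\mathcal{P}^f = \{x \in \mathbb{R}^n : x(j) \ge f(j), \forall j \in V\}. \tag{31}$$

Proof. Let $x$ satisfy the singleton inequalities of Eqn. (30), that is, $\forall i, x(i) \ge f(i)$. Then for any set $S$ we have $x(S) = \sum_{i \in S} x(i) \ge \sum_{i \in S} f(i)$.
Hence $x(S) \ge \sum_{i \in S} f(i) \ge f(S)$, where the last inequality follows from submodularity and $f(\emptyset) = 0$. Thus, the irredundant inequalities are the singletons. □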

The lemma states that this polyhedron is not polyhedrally tight in that the vast majority of the defining
inequalities are redundant. Unlike the submodular lower polyhedron, the submodular upper polyhedron is not
particularly interesting or useful for defining a concave extension. We shall, however, define a generalization of
the submodular upper polyhedron in Section 6.3 that will prove quite useful in characterizing, and providing
approximations to, the concave extension of $f$.


We end this section by investigating the submodular upper polyhedron membership problem. Owing
to the simple structure of the polyhedron, this problem is easy, which might seem surprising at first glance since, from
Eqn. (30), the problem of checking $x \in \mathcal{P}^f$ is equivalent to checking if $\max_{X \subseteq V} f(X) - x(X) \le 0$. In general,
this would involve the maximization of a submodular function, which is NP hard. The following corollary shows
that this particular problem is actually easy.

Corollary 6.2. Given a submodular function $f$ and vector $x$, let $X$ be a set such that $f(X) - x(X) > 0$.
Then there exists an $i \in X$ such that $f(i) - x(i) > 0$.

Proof. Observe that $f(X) - x(X) \le \sum_{i \in X} [f(i) - x(i)]$. Since the l.h.s. is greater than 0, it implies that
$\sum_{i \in X} [f(i) - x(i)] > 0$. Hence there must exist an $i \in X$ such that $f(i) - x(i) > 0$. □

Thus, it is sufficient to check the singleton values, i.e., $f(i) - x(i)$, and if all these are less than or equal to
zero, then $x \in \mathcal{P}^f$. This also follows immediately from Lemma 6.1.
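The singleton test can be compared against the full definition in Eqn. (30) by brute force. The sketch below uses the function of Table 1 with exact rational arithmetic; the helper names and the grid of test points are illustrative assumptions.

```python
from fractions import Fraction as Fr
from itertools import combinations, product

V = (1, 2, 3)
# The submodular function of Table 1, stored exactly.
TABLE_1 = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(1),
    frozenset({2}): Fr(2),
    frozenset({3}): Fr(2),
    frozenset({1, 2}): Fr(5, 2),
    frozenset({2, 3}): Fr(3),
    frozenset({1, 3}): Fr(14, 5),
    frozenset({1, 2, 3}): Fr(3),
}


def f(S):
    return TABLE_1[frozenset(S)]


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def in_upper_polyhedron(x):
    """Singleton test of Lemma 6.1: n inequalities."""
    return all(x[j] >= f({j}) for j in V)


def in_upper_polyhedron_bruteforce(x):
    """Definition (30): one inequality per subset, 2^n in total."""
    return all(sum(x[j] for j in S) >= f(S) for S in subsets(V))


grid = [Fr(k, 2) for k in range(7)]  # 0, 1/2, ..., 3
points = [dict(zip(V, p)) for p in product(grid, repeat=3)]
tests_agree = all(
    in_upper_polyhedron(x) == in_upper_polyhedron_bruteforce(x) for x in points
)
```

The two tests agree on every grid point, while the singleton test inspects only $n$ values of $f$.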

An interesting corollary of the above is that it is easy to check whether the maximum value of a submodular function
is strictly positive. Given a submodular function $f$, the problem is whether $\max_{X \subseteq V} f(X) > 0$.
This can easily be checked without resorting to submodular function maximization.

Corollary 6.3. Given a submodular function $f$ with $f(\emptyset) = 0$, $\max_{X \subseteq V} f(X) > 0$ if and only if there exists
an $i \in V$ such that $f(i) > 0$.

Proof. If for any $j$, $f(j) > 0$, it implies that $\max_{X \subseteq V} f(X) \ge f(j) > 0$. On the other hand, if $\forall j, f(j) \le 0$,
we have that $\forall X \subseteq V, f(X) \le \sum_{i \in X} f(i) \le 0$. Hence $\max_{X \subseteq V} f(X) = f(\emptyset) = 0$. □

This fact is true only for a submodular function. For general set functions, even when $f(\emptyset) = 0$, it could
potentially require an exponential cost search to determine if $\max_{X \subseteq V} f(X) > 0$.
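Corollary 6.3 reduces the positivity question to $n$ function evaluations. The sketch below contrasts it with exhaustive search on two hypothetical submodular functions (concave functions of the cardinality); all names are assumptions made for illustration.

```python
from itertools import combinations


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def max_is_positive(f, V):
    """Corollary 6.3: inspect only the n singleton values."""
    return any(f(frozenset({j})) > 0 for j in V)


def max_is_positive_bruteforce(f, V):
    """Exhaustive check over all 2^n subsets."""
    return max(f(S) for S in subsets(V)) > 0


V = (0, 1, 2, 3)


def f_a(S):
    """Concave in |S| with a positive singleton value."""
    return [0, 2, 3, 3, 2][len(S)]


def f_b(S):
    """Concave in |S| with all singleton values negative."""
    return [0, -1, -3, -6, -10][len(S)]
```

For `f_b` the maximum is attained at the empty set and equals zero, so both tests report that the maximum is not strictly positive.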

6.2 The Submodular Superdifferentials 

Given a submodular function $f$, we can characterize its superdifferentials that constitute a partition of $\mathbb{R}^n$.
Given any $X \subseteq V$, we denote the superdifferential with respect to $X$ as $\partial^f(X)$ and define it as follows:

$$\partial^f(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \le f(X) - x(X), \forall Y \subseteq V\}. \tag{32}$$

This characterization is analogous to the subdifferential of a submodular function defined in Eqn. (5). This 
is also akin to the superdifferential corresponding to a continuous concave function. Since, as we will see, 
submodular functions have both a subdifferential and a superdifferential structure that are distinct, the two 
names will, correspondingly, refer to distinct constructs. 

Each supergradient $g_X \in \partial^f(X)$ defines a modular upper bound of a submodular function. In particular,
define the following modular function:

$$m^X(Y) \triangleq f(X) + g_X(Y) - g_X(X). \tag{33}$$

Then $m^X(Y)$ is a modular function which satisfies $m^X(Y) \ge f(Y), \forall Y \subseteq V$ and $m^X(X) = f(X)$. This is
analogous to the submodular subdifferential in Section 2.2 (and in particular Eqn. (7)) where tight modular
lower bounds were produced; here, we produce tight modular upper bounds on any submodular function.

We note that $(x(v_1), x(v_2), \ldots, x(v_n)) = (f(v_1), f(v_2), \ldots, f(v_n)) \in \partial^f(\emptyset)$, which shows at least that $\partial^f(\emptyset)$
is non-empty. A bit further below (specifically Theorem 6.8) we show that for any submodular function, $\partial^f(X)$ is
non-empty for all $X \subseteq V$.

Note that the superdifferential is defined by an exponential (i.e., $2^{|V|}$) number of inequalities. However,
owing to the submodularity of $f$ and akin to the subdifferential of $f$, we can reduce the number of inequalities
since some of them are redundant given the others. Define three polyhedra as follows:

$$\partial^f_1(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \le f(X) - x(X), \forall Y \subseteq X\}, \tag{34}$$

$$\partial^f_2(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \le f(X) - x(X), \forall Y \supseteq X\}, \tag{35}$$

$$\partial^f_3(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \le f(X) - x(X), \forall Y : Y \not\subseteq X, Y \not\supseteq X\}. \tag{36}$$

An immediate observation is that

$$\partial^f(X) = \partial^f_1(X) \cap \partial^f_2(X) \cap \partial^f_3(X). \tag{37}$$

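As we show below for a submodular function $f$, $\partial^f_1(X)$ and $\partial^f_2(X)$ are actually very simple polyhedra.

Lemma 6.4. For a submodular function $f$,

$$\partial^f_1(X) = \{x \in \mathbb{R}^n : f(j \mid X \setminus j) \ge x(j), \forall j \in X\}, \tag{38}$$

$$\partial^f_2(X) = \{x \in \mathbb{R}^n : f(j \mid X) \le x(j), \forall j \notin X\}. \tag{39}$$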


Proof. Consider $\partial^f_1(X)$. Notice that the inequalities defining the polyhedron, starting from Eqn. (34), can
be rewritten as $\partial^f_1(X) = \{x \in \mathbb{R}^n : x(X \setminus Y) \le f(X) - f(Y), \forall Y \subseteq X\}$. We then have that $x(X \setminus Y) =
\sum_{j \in X \setminus Y} x(j) \le \sum_{j \in X \setminus Y} f(j \mid X \setminus j)$, since $\forall j \in X, x(j) \le f(j \mid X \setminus j)$ (this follows by considering only the
subset of inequalities of $\partial^f_1(X)$ in Eqn. (34) with sets $Y \subseteq X$ such that $|X \setminus Y| = 1$). We then have
$x(X \setminus Y) \le \sum_{j \in X \setminus Y} f(j \mid X \setminus j) \le f(X) - f(Y)$, where the last inequality follows from submodularity alone.
Hence an irredundant set of inequalities includes those defined only through the singletons.

In order to show the characterization for $\partial^f_2(X)$, we have, starting from Eqn. (35), that $\partial^f_2(X) = \{x \in
\mathbb{R}^n : x(Y \setminus X) \ge f(Y) - f(X), \forall Y \supseteq X\}$. It then follows that $x(Y \setminus X) = \sum_{j \in Y \setminus X} x(j) \ge \sum_{j \in Y \setminus X} f(j \mid X)$,
since $\forall j \notin X, x(j) \ge f(j \mid X)$. Hence $x(Y \setminus X) \ge \sum_{j \in Y \setminus X} f(j \mid X) \ge f(Y) - f(X)$, and again, an irredundant
set of inequalities includes those defined only through the singletons. □

The above characterization shows that $\partial^f(X)$ can be determined using many fewer inequalities since the
polyhedra $\partial^f_1(X)$ and $\partial^f_2(X)$ are so simple. Recall that this is analogous to the submodular subdifferential,
where again owing to submodularity the number of essential inequalities can be reduced significantly; in
that case, we just need to consider the sets $Y$ which are subsets and supersets of $X$. It is interesting to note
the contrast between the redundancy of inequalities in the subdifferentials and the superdifferentials. In
particular, here, the inequalities corresponding to sets $Y$ being the subsets and supersets of $X$ are mostly
redundant, while the irredundant ones are the rest of the inequalities. In other words, in the case of the
subdifferential, $\partial_f^1(X)$ and $\partial_f^2(X)$ were essential, while $\partial_f^3(X)$ was entirely redundant given the first two. In
the case of the superdifferentials, $\partial^f_1(X)$ and $\partial^f_2(X)$ are mostly internally redundant (they can be represented
using only $n$ inequalities), while $\partial^f_3(X)$ has no redundancy in general.

In order to gain more intuition for the superdifferentials, we next consider some examples in both two and 
three dimensions. 

Example 6.1. See Figure 7. We consider here superdifferentials of a submodular function when $V = \{1,2\}$.
Then from the lemma above, $\partial^f(\emptyset) = \{x \in \mathbb{R}^2 : f(j \mid \emptyset) \le x(j), \forall j \in \{1,2\}\}$. Similarly $\partial^f(\{1,2\}) = \{x \in \mathbb{R}^2 :
f(j \mid V \setminus j) \ge x(j), \forall j \in \{1,2\}\}$. Now consider $\partial^f(\{1\})$. The governing inequalities for this are:

$$\partial^f(\{1\}) = \{x \in \mathbb{R}^2 : x_1 \le f(\{1\}), \tag{40}$$

$$x_2 \ge f(\{2\} \mid \{1\}), \tag{41}$$

$$x_1 - x_2 \le f(\{1\}) - f(\{2\})\}. \tag{42}$$

The extreme points of this polyhedron are the vectors $(f(\{1\}), f(\{2\})) = (f(\{1\} \mid \emptyset), f(\{2\} \mid \emptyset))$ and $(f(\{1\} \mid \{2\}), f(\{2\} \mid \{1\}))$.
The way we obtain the extreme points is as follows. Setting the inequalities (40) and (42) as equalities,
we get the extreme point $(f(\{1\}), f(\{2\}))$. The inequality (41) then is $x_2 = f(\{2\}) \ge f(\{2\} \mid \{1\})$, which
holds. We then set inequalities (41) and (42) as equalities, which gives $x_2 = f(\{2\} \mid \{1\})$ and $x_1 = f(\{1\} \mid \{2\})$,
thus giving the second extreme point $(f(\{1\} \mid \{2\}), f(\{2\} \mid \{1\}))$. One can see that inequality (40) is satisfied.
Finally, if we set inequalities (40) and (41) as equalities, we get $x_1 = f(\{1\}), x_2 = f(\{2\} \mid \{1\})$.
Then we have that $x_1 - x_2 = f(\{1\}) - f(\{2\} \mid \{1\}) = 2f(\{1\}) - f(\{1,2\})$. Inequality (42) then requires
$2f(\{1\}) - f(\{1,2\}) \le f(\{1\}) - f(\{2\}) \Rightarrow f(\{1\}) + f(\{2\}) \le f(\{1,2\})$, which does not hold (unless $f$ is
trivially modular, in which case the first two extreme points collapse onto the third). Hence the only extreme
points are the two vectors above. One can similarly investigate $\partial^f(\{2\})$, which has the same extreme points.

Figure 7: A visualization of the four submodular superdifferentials $\partial^f(Y)$ for different sets $Y$ in two dimensions
$V = \{v_1, v_2\}$, as described in Example 6.1.

It is clear from the above example that superdifferentials in the two-dimensional case are easy to find
and characterize. However, this is not the case in three dimensions, where the shape of the superdifferentials
depends strongly on the particulars of the submodular function. This means that one cannot characterize
the superdifferential polyhedra knowing only that $f$ is submodular; more information about the specific
instance is required.

Example 6.2. Let $V = \{1,2,3\}$. Recall that $f(\emptyset) = 0$. Then consider $\partial^f(\{1\}) = \{x \in \mathbb{R}^3 : f(Y) - x(Y) \le
f(\{1\}) - x(\{1\}), \forall Y \subseteq \{1,2,3\}\}$. This polyhedron can be represented via the following irredundant
inequalities:

$$\partial^f(\{1\}) = \{x \in \mathbb{R}^3 : x_1 \le f(\{1\}), \quad \text{for } Y = \emptyset \tag{43}$$

$$x_2 \ge f(\{2\} \mid \{1\}), \quad \text{for } Y = \{1,2\} \tag{44}$$

$$x_2 - x_1 \ge f(\{2\}) - f(\{1\}), \quad \text{for } Y = \{2\} \tag{45}$$

$$x_3 \ge f(\{3\} \mid \{1\}), \quad \text{for } Y = \{1,3\} \tag{46}$$

$$x_3 - x_1 \ge f(\{3\}) - f(\{1\}), \quad \text{for } Y = \{3\} \tag{47}$$

$$x_2 + x_3 - x_1 \ge f(\{2,3\}) - f(\{1\}), \quad \text{for } Y = \{2,3\}\} \tag{48}$$

The other two inequalities (for $Y = \{1,2,3\}$ and $Y = \{1\}$) are redundant given the above. We now consider
the extreme points of this polyhedron. We consider Eqns. (43), (45), (47) with equality, and we obtain an
extreme point $(f(\{1\}), f(\{2\}), f(\{3\}))$. It is the case that all other inequalities are satisfied. Consider next
Eqns. (45), (47), and (48) with equality, and we get a potential extreme point $(f(\{1\}) + f(\{2,3\}) - f(\{2\}) -
f(\{3\}), f(\{2\} \mid \{3\}), f(\{3\} \mid \{2\}))$. Observe that $x_1 = f(\{1\}) + f(\{2,3\}) - f(\{2\}) - f(\{3\}) \le f(\{1\})$, and hence
Eqn. (43) is satisfied. However, $x_2 = f(\{2\} \mid \{3\})$ may be bigger or smaller than $f(\{2\} \mid \{1\})$ (depending on the
specific submodular function instance) and Eqn. (44) might or might not be violated. Similarly, $x_3 = f(\{3\} \mid \{2\})$
is not comparable to $f(\{3\} \mid \{1\})$, and hence Eqn. (46) might or might not be violated. Consequently we cannot
determine if by combining together Eqns. (45), (47), and (48), we obtain an extreme point, unless we have
more information about the current submodular function being used. We therefore, from this example, see
that we cannot hope to find the extreme points both analytically and generically.

The above example shows that a particular expression (obtained via a combination of inequalities) might
or might not be extreme, depending on the particular submodular function and its valuation. This is unlike
the subdifferential, where a certain analytical expression is always extreme for all submodular functions.
Thus, unlike the subdifferentials, we cannot expect a closed form expression for the extreme points of $\partial^f(X)$.
Moreover, they also seem to be hard to characterize algorithmically. For example, the superdifferential
membership problem is NP hard.

Lemma 6.5. Given a submodular function $f$ and a set $X : \emptyset \subset X \subset V$, the membership problem $y \in \partial^f(X)$
is NP hard.

Proof. Notice that the membership problem $y \in \partial^f(X)$ is equivalent to asking whether $\max_{Y \subseteq V} f(Y) - y(Y) \le
f(X) - y(X)$. In other words, this is equivalent to asking if $X$ is a maximizer of $f(Y) - y(Y)$ for a given
vector $y$. This is the decision version of the submodular maximization problem and correspondingly is NP
hard when $\emptyset \subset X \subset V$. □
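A direct membership test enumerates all $2^n$ sets $Y$ in Eqn. (32), which is consistent with the hardness result above. The sketch below applies such a brute-force test to the function of Table 1 at $X = \{1\}$, using the two candidate points from Example 6.2; the helper names are illustrative assumptions.

```python
from fractions import Fraction as Fr
from itertools import combinations

V = (1, 2, 3)
# The submodular function of Table 1, stored exactly.
TABLE_1 = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(1),
    frozenset({2}): Fr(2),
    frozenset({3}): Fr(2),
    frozenset({1, 2}): Fr(5, 2),
    frozenset({2, 3}): Fr(3),
    frozenset({1, 3}): Fr(14, 5),
    frozenset({1, 2, 3}): Fr(3),
}


def f(S):
    return TABLE_1[frozenset(S)]


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def in_superdifferential(x, X):
    """Definition (32), checked over all 2^n sets Y (exponential in n)."""
    X = frozenset(X)
    rhs = f(X) - sum(x[j] for j in X)
    return all(f(Y) - sum(x[j] for j in Y) <= rhs for Y in subsets(V))


X = {1}
# Candidate from Eqns. (43), (45), (47) taken with equality.
p1 = {1: f({1}), 2: f({2}), 3: f({3})}
# Candidate from Eqns. (45), (47), (48) taken with equality.
p2 = {
    1: f({1}) + f({2, 3}) - f({2}) - f({3}),
    2: f({2, 3}) - f({3}),
    3: f({2, 3}) - f({2}),
}
```

For this particular function the second candidate violates Eqn. (44), since $f(\{2\} \mid \{3\}) = 1 < f(\{2\} \mid \{1\}) = 1.5$, so it is not a point of $\partial^f(\{1\})$.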

Given that the membership problem is NP hard, it is also NP hard to solve a linear program over this
polyhedron [22, 51]. The superdifferential for the empty set $X = \emptyset$ and the ground set $X = V$, however, can
be characterized easily:

Lemma 6.6. For any submodular function $f$ such that $f(\emptyset) = 0$, $\partial^f(\emptyset) = \{x \in \mathbb{R}^n : f(j) \le x(j), \forall j \in V\}$.
Similarly $\partial^f(V) = \{x \in \mathbb{R}^n : f(j \mid V \setminus j) \ge x(j), \forall j \in V\}$. Furthermore, $\partial^f(\emptyset) = \mathcal{P}^f$ and $\partial^f(V) = $ .

Proof. Consider $X = \emptyset$; then $\partial^f(\emptyset) = \{x \in \mathbb{R}^n : f(Y) \le x(Y), \forall Y \subseteq V\}$. Assuming only the $|V|$ inequalities
for $Y = \{j\}, j \in V$, gives $f(Y) \le \sum_{j \in Y} f(j) \le \sum_{j \in Y} x(j) = x(Y)$, meaning only these $|V|$ inequalities are
necessary. For $X = V$, $\partial^f(V) = \{x \in \mathbb{R}^n : x(V) - x(Y) \le f(V) - f(Y), \forall Y \subseteq V\}$. Assuming only the $|V|$
inequalities for $Y = V \setminus \{j\}$ gives $\sum_{j \in V \setminus Y} x(j) \le \sum_{j \in V \setminus Y} f(j \mid V \setminus j) \le f(V \setminus Y \mid Y) = f(V) - f(Y)$. The
rest follows directly from the definitions. □

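This also follows by first noting that in Eqn. (36) we have $\partial^f_3(\emptyset) = \partial^f_3(V) = \mathbb{R}^n$. Then, by using Eqn. (37)
and Lemma 6.4, we have that Eqn. (34) implies $\partial^f_1(\emptyset) = \mathbb{R}^n$ so that $\partial^f(\emptyset) = \partial^f_2(\emptyset)$, and that Eqn. (35) implies
$\partial^f_2(V) = \mathbb{R}^n$ so that $\partial^f(V) = \partial^f_1(V)$.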

As we see from the above, it is hard to characterize the superdifferential of a submodular function. It 
is however possible to provide computationally feasible inner and outer bounds as shown in the following 
subsections. Using these, we can also find certain specific and practically useful supergradients. 

6.2.1 Outer bounds on the superdifferential 

It is possible to provide a number of useful and practical outer bounds on the superdifferential. Recall from
Lemma 6.4 that $\partial^f_1(X)$ and $\partial^f_2(X)$, defined in Eqns. (34) and (35), are already simple polyhedra. We can
then provide outer bounds on $\partial^f_3(X)$ that, together with $\partial^f_1(X)$ and $\partial^f_2(X)$, provide simple bounds on $\partial^f(X)$.
Define for $1 \le k, l \le n$:

$$\partial^f_{3,\Delta(k,l)}(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \le f(X) - x(X), \ \forall Y : Y \not\subseteq X, Y \not\supseteq X, |Y \setminus X| \le k-1, |X \setminus Y| \le l-1\}. \tag{49}$$

Note that for $\emptyset \subset X \subset V$, $\partial^f_{3,\Delta(n,n)}(X) = \partial^f_3(X)$ and $\partial^f_{3,\Delta(k,l)}(X) \supseteq \partial^f_3(X)$ for $1 \le k, l \le n$. We also have
that $\partial^f_{3,\Delta(1,1)}(X) = \mathbb{R}^n$. We can then define the outer bound:

$$\partial^f_{\Delta(k,l)}(X) = \partial^f_1(X) \cap \partial^f_2(X) \cap \partial^f_{3,\Delta(k,l)}(X) \supseteq \partial^f(X). \tag{50}$$

Observe that $\partial^f_{\Delta(k,l)}(X)$ is expressed in terms of $O(n^{k+l})$ inequalities, and hence for given constants $k, l$
we can obtain the representation of $\partial^f_{\Delta(k,l)}(X)$ in polynomial time. We will see that this provides us with a
hierarchy of outer bounds on the superdifferential:

Theorem 6.7. For a submodular function $f$:

1. $\partial^f_{\Delta(1,1)}(X) = \partial^f_1(X) \cap \partial^f_2(X)$.

2. $\forall\, 1 \le k' \le k \le n, \ 1 \le l' \le l \le n$: $\partial^f(X) \subseteq \partial^f_{\Delta(k,l)}(X) \subseteq \partial^f_{\Delta(k',l')}(X) \subseteq \partial^f_{\Delta(1,1)}(X)$.

3. $\partial^f_{\Delta(n,n)}(X) = \partial^f(X)$.

Proof. The proofs of items 1 and 3 follow directly from definitions. To see item 2, notice that the polyhedra
$\partial^f_{\Delta(k,l)}(X)$ become tighter as $k$ and $l$ increase, finally approaching the superdifferential. □

Similar to how Eqn. (12) relates to the submodular subdifferential, we shall call $\partial^f_{\Delta(1,1)}(X)$ the local
approximation of the superdifferential. In particular,

$$\partial^f_{\Delta(1,1)}(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \le f(X) - x(X), \forall Y \in [\emptyset, X] \cup [X, V]\} \tag{51}$$

$$= \{x \in \mathbb{R}^n : \forall j \in X, f(j \mid X \setminus j) \ge x(j) \text{ and } \forall j \notin X, f(j \mid X) \le x(j)\}, \tag{52}$$

where the second equality follows from Lemma 6.4. It is interesting to note that the very same irredundant
sets that equivalently define the subdifferential in Eqn. (11) are also the ones that define an outer bound of
the superdifferential in Eqn. (51).

We shall see in Section 8 that these outer bounds have interesting connections with approximation 
algorithms for submodular maximization. 
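The hierarchy of Theorem 6.7 can be explored numerically. The sketch below tests membership in $\partial^f_{\Delta(k,l)}(X)$ for the function of Table 1 at $X = \{1\}$ by enumerating the sets $Y$ allowed by Eqns. (49) and (50); it enumerates all subsets for simplicity rather than only the $O(n^{k+l})$ relevant ones, and the helper names and test grid are assumptions made for illustration.

```python
from fractions import Fraction as Fr
from itertools import combinations, product

V = (1, 2, 3)
# The submodular function of Table 1, stored exactly.
TABLE_1 = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(1),
    frozenset({2}): Fr(2),
    frozenset({3}): Fr(2),
    frozenset({1, 2}): Fr(5, 2),
    frozenset({2, 3}): Fr(3),
    frozenset({1, 3}): Fr(14, 5),
    frozenset({1, 2, 3}): Fr(3),
}


def f(S):
    return TABLE_1[frozenset(S)]


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def in_outer_bound(x, X, k, l):
    """Eqn. (50): sets comparable to X, plus incomparable sets Y with
    at most k-1 elements outside X and at most l-1 elements of X missing."""
    X = frozenset(X)
    rhs = f(X) - sum(x[j] for j in X)
    for Y in subsets(V):
        comparable = Y <= X or Y >= X
        small = len(Y - X) <= k - 1 and len(X - Y) <= l - 1
        if (comparable or small) and f(Y) - sum(x[j] for j in Y) > rhs:
            return False
    return True


def in_superdifferential(x, X):
    """Definition (32) by brute force."""
    X = frozenset(X)
    rhs = f(X) - sum(x[j] for j in X)
    return all(f(Y) - sum(x[j] for j in Y) <= rhs for Y in subsets(V))


X = frozenset({1})
grid = [Fr(k, 2) for k in range(-2, 7)]  # -1, -1/2, ..., 3
points = [dict(zip(V, p)) for p in product(grid, repeat=3)]
levels = [(k, l) for k in (1, 2, 3) for l in (1, 2, 3)]

# Item 2: membership at level (k, l) implies membership at every lower level.
nested = all(
    in_outer_bound(x, X, k2, l2)
    for x in points
    for (k, l) in levels if in_outer_bound(x, X, k, l)
    for (k2, l2) in levels if k2 <= k and l2 <= l
)
# Item 3: level (n, n) recovers the superdifferential.
exact_at_n = all(
    in_outer_bound(x, X, 3, 3) == in_superdifferential(x, X) for x in points
)
# Extreme point of the local approximation, cf. Eqn. (52).
g_local = {1: Fr(1), 2: Fr(3, 2), 3: Fr(9, 5)}
```

The point `g_local` lies in the loosest bound $\partial^f_{\Delta(1,1)}(\{1\})$ but is already cut off at level $(2,2)$ by the inequality for $Y = \{2\}$.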


6.2.2 Inner Bounds on the superdifferential 

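While it is hard to characterize the extreme points of the superdifferential, we can provide some specific and
useful supergradients. For any $X \subseteq V$, define three vectors $\hat{g}_X, \check{g}_X, \bar{g}_X \in \mathbb{R}^n$ as follows:

$$\hat{g}_X(j) = \begin{cases} f(j \mid X \setminus j) & \text{if } j \in X \\ f(j) & \text{if } j \notin X, \end{cases} \tag{53}$$

$$\check{g}_X(j) = \begin{cases} f(j \mid V \setminus j) & \text{if } j \in X \\ f(j \mid X) & \text{if } j \notin X, \end{cases} \tag{54}$$

and

$$\bar{g}_X(j) = \begin{cases} f(j \mid V \setminus j) & \text{if } j \in X \\ f(j) & \text{if } j \notin X. \end{cases} \tag{55}$$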


Then we have the following theorem: 

Theorem 6.8. For a submodular function $f$, $\hat{g}_X, \check{g}_X, \bar{g}_X \in \partial^f(X)$. Hence for every submodular function $f$
and set $X$, $\partial^f(X)$ is non-empty.


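Proof. For submodular $f$, the following bounds are known to hold [45] for all $X, Y \subseteq V$:

$$f(Y) \le f(X) - \sum_{j \in X \setminus Y} f(j \mid X \setminus j) + \sum_{j \in Y \setminus X} f(j \mid X \cap Y), \tag{56}$$

$$f(Y) \le f(X) - \sum_{j \in X \setminus Y} f(j \mid X \cup Y \setminus j) + \sum_{j \in Y \setminus X} f(j \mid X). \tag{57}$$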

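Using submodularity, we can loosen these bounds further to provide tight modular upper bounds [1, 34, 33,
27, 26]:

$$f(Y) \le f(X) - \sum_{j \in X \setminus Y} f(j \mid X \setminus \{j\}) + \sum_{j \in Y \setminus X} f(j \mid \emptyset), \tag{58}$$

$$f(Y) \le f(X) - \sum_{j \in X \setminus Y} f(j \mid V \setminus \{j\}) + \sum_{j \in Y \setminus X} f(j \mid X), \tag{59}$$

$$f(Y) \le f(X) - \sum_{j \in X \setminus Y} f(j \mid V \setminus \{j\}) + \sum_{j \in Y \setminus X} f(j \mid \emptyset). \tag{60}$$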

From the three bounds above, and substituting the expressions of the supergradients, we may immediately
verify that these are supergradients, namely that $\hat{g}_X, \check{g}_X, \bar{g}_X \in \partial^f(X)$. For example, starting with Eqn. (58),
we have that for all $Y \subseteq V$:

$$f(Y) \le f(X) - \sum_{j \in X \setminus Y} f(j \mid X \setminus \{j\}) + \sum_{j \in Y \setminus X} f(j \mid \emptyset) \tag{61}$$

$$= f(X) - \sum_{j \in X} f(j \mid X \setminus \{j\}) + \sum_{j \in X \cap Y} f(j \mid X \setminus \{j\}) + \sum_{j \in Y \setminus X} f(j \mid \emptyset) \tag{62}$$

$$= f(X) - \hat{g}_X(X) + \hat{g}_X(Y) = \hat{m}^X(Y), \tag{63}$$

where $\hat{m}^X(Y)$ is the modular upper bound of $f$ associated with the supergradient $\hat{g}_X$ that is tight at $Y = X$.
Similar expansions can start with Eqns. (59) and (60), which define $\check{m}^X(Y)$ and $\bar{m}^X(Y)$ as the modular upper
bounds of $f$, tight at $Y = X$, associated with the supergradients $\check{g}_X$ and $\bar{g}_X$ respectively. □
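The three supergradients of Eqns. (53)-(55) are cheap to compute, and the resulting modular upper bounds can be verified exhaustively on a small instance. The sketch below does this for the function of Table 1 and every set $X$; the helper names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import combinations

V = (1, 2, 3)
# The submodular function of Table 1, stored exactly.
TABLE_1 = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(1),
    frozenset({2}): Fr(2),
    frozenset({3}): Fr(2),
    frozenset({1, 2}): Fr(5, 2),
    frozenset({2, 3}): Fr(3),
    frozenset({1, 3}): Fr(14, 5),
    frozenset({1, 2, 3}): Fr(3),
}


def f(S):
    return TABLE_1[frozenset(S)]


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def gain(j, S):
    """Marginal gain f(j | S) = f(S + j) - f(S), for j not in S."""
    S = frozenset(S)
    return f(S | {j}) - f(S)


def supergradients(X):
    """The vectors of Eqns. (53), (54), (55), in that order."""
    X, full = frozenset(X), frozenset(V)
    g_hat = {j: gain(j, X - {j}) if j in X else f({j}) for j in V}
    g_check = {j: gain(j, full - {j}) if j in X else gain(j, X) for j in V}
    g_bar = {j: gain(j, full - {j}) if j in X else f({j}) for j in V}
    return g_hat, g_check, g_bar


def is_tight_upper_bound(g, X):
    """Check m(Y) = f(X) - g(X) + g(Y) >= f(Y) for all Y, with equality at Y = X."""
    X = frozenset(X)

    def m(Y):
        return f(X) - sum(g[j] for j in X) + sum(g[j] for j in Y)

    return m(X) == f(X) and all(m(Y) >= f(Y) for Y in subsets(V))


all_ok = all(
    is_tight_upper_bound(g, X) for X in subsets(V) for g in supergradients(X)
)
g_hat_1, g_check_1, g_bar_1 = supergradients({1})
```

Each vector needs only $O(n)$ evaluations of $f$, in contrast with the exponential membership test for a general point.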


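These three supergradients (i.e., Eqns. (53)-(55)) can be used to characterize useful and practical inner
bounds of the superdifferential. First, we define two helper polyhedra:

$$\partial^f_{\emptyset}(X) = \{x \in \mathbb{R}^n : f(j) \le x(j), \forall j \notin X\}, \tag{64}$$

$$\partial^f_{V}(X) = \{x \in \mathbb{R}^n : f(j \mid V \setminus j) \ge x(j), \forall j \in X\}. \tag{65}$$

Then we define the following three polyhedra:

$$\hat{\partial}^f(X) = \partial^f_1(X) \cap \partial^f_{\emptyset}(X) = \{x \in \mathbb{R}^n : f(j \mid X \setminus j) \ge x(j), \forall j \in X \text{ and } f(j) \le x(j), \forall j \notin X\}, \tag{66}$$

$$\check{\partial}^f(X) = \partial^f_2(X) \cap \partial^f_{V}(X) = \{x \in \mathbb{R}^n : f(j \mid X) \le x(j), \forall j \notin X \text{ and } f(j \mid V \setminus j) \ge x(j), \forall j \in X\}, \tag{67}$$

$$\bar{\partial}^f(X) = \partial^f_{V}(X) \cap \partial^f_{\emptyset}(X) = \{x \in \mathbb{R}^n : f(j \mid V \setminus j) \ge x(j), \forall j \in X \text{ and } f(j) \le x(j), \forall j \notin X\}. \tag{68}$$

Then note that $\hat{\partial}^f(X)$ is a polyhedron with $\hat{g}_X$ as an extreme point. Similarly $\check{\partial}^f(X)$ has $\check{g}_X$, while $\bar{\partial}^f(X)$
has $\bar{g}_X$, as their respective extreme points. All these are simple polyhedra, each with a single extreme point.
We also define the polyhedron:

$$\partial^f_{\text{conv}}(X) = \text{conv}(\hat{\partial}^f(X), \check{\partial}^f(X)), \tag{69}$$

where $\text{conv}(\cdot,\cdot)$ represents the convex combination of two polyhedra$^5$. Then $\partial^f_{\text{conv}}(X)$ is a polyhedron which
has $\hat{g}_X$ and $\check{g}_X$ as its extreme points. The following lemma then characterizes these polyhedra and how they
are inner bounds of the superdifferential: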

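Lemma 6.9 (Superdifferential Inner Bound Relationships). Given a submodular function $f$,

$$\bar{\partial}^f(X) \subseteq \hat{\partial}^f(X) \subseteq \partial^f_{\text{conv}}(X) \subseteq \partial^f(X), \tag{70}$$

$$\bar{\partial}^f(X) \subseteq \check{\partial}^f(X) \subseteq \partial^f_{\text{conv}}(X) \subseteq \partial^f(X). \tag{71}$$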

Proof. The proof of this lemma follows directly from the definitions of the supergradients, the corresponding 
polyhedra, and of submodularity. □ 
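The containments among the three box-like inner bounds of Eqns. (66)-(68) and the superdifferential can be spot-checked on a grid of points. The sketch below does so for the function of Table 1 over every set $X$; the convex-hull polyhedron of Eqn. (69) is not tested, and the helper names and the grid are assumptions made for illustration.

```python
from fractions import Fraction as Fr
from itertools import combinations, product

V = (1, 2, 3)
# The submodular function of Table 1, stored exactly.
TABLE_1 = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(1),
    frozenset({2}): Fr(2),
    frozenset({3}): Fr(2),
    frozenset({1, 2}): Fr(5, 2),
    frozenset({2, 3}): Fr(3),
    frozenset({1, 3}): Fr(14, 5),
    frozenset({1, 2, 3}): Fr(3),
}
FULL = frozenset(V)


def f(S):
    return TABLE_1[frozenset(S)]


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def gain(j, S):
    """Marginal gain f(j | S) for j not in S."""
    S = frozenset(S)
    return f(S | {j}) - f(S)


def in_hat(x, X):
    """Eqn. (66)."""
    return (all(x[j] <= gain(j, X - {j}) for j in X)
            and all(x[j] >= f({j}) for j in V if j not in X))


def in_check(x, X):
    """Eqn. (67)."""
    return (all(x[j] <= gain(j, FULL - {j}) for j in X)
            and all(x[j] >= gain(j, X) for j in V if j not in X))


def in_bar(x, X):
    """Eqn. (68)."""
    return (all(x[j] <= gain(j, FULL - {j}) for j in X)
            and all(x[j] >= f({j}) for j in V if j not in X))


def in_superdifferential(x, X):
    """Definition (32) by brute force."""
    rhs = f(X) - sum(x[j] for j in X)
    return all(f(Y) - sum(x[j] for j in Y) <= rhs for Y in subsets(V))


grid = [Fr(k, 2) for k in range(-2, 7)]  # -1, -1/2, ..., 3
points = [dict(zip(V, p)) for p in product(grid, repeat=3)]


def implies(a, b):
    return (not a) or b


containments_hold = all(
    implies(in_bar(x, X), in_hat(x, X) and in_check(x, X))
    and implies(in_hat(x, X), in_superdifferential(x, X))
    and implies(in_check(x, X), in_superdifferential(x, X))
    for X in subsets(V)
    for x in points
)
g_hat_1 = {1: Fr(1), 2: Fr(2), 3: Fr(2)}  # Eqn. (53) at X = {1}
```

At $X = \{1\}$ the point `g_hat_1` lies in the polyhedron of Eqn. (66) but not in that of Eqn. (68), so the first containment can be strict.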


5 Given two polyhedra $\mathcal{P}_1, \mathcal{P}_2$, $\mathcal{P} = \text{conv}(\mathcal{P}_1, \mathcal{P}_2) = \{\lambda x_1 + (1 - \lambda) x_2 : \lambda \in [0,1], x_1 \in \mathcal{P}_1, x_2 \in \mathcal{P}_2\}$.

Figure 8: An illustration to compare the relative positions of the sub- and superdifferentials on the submodular
function $f : 2^{\{1,2,3\}} \to \mathbb{R}$ defined as shown in Table 1. The subdifferential appears in red, while the
superdifferential is shown in blue; both are defined at $X = \{1\}$. Also shown is the point $g_X$ defined in Eqn. (72)
and the corresponding outer bounds ($\partial_f^{(1,1)}(X)$ and $\partial^f_{\Delta(1,1)}(X)$, defined in Eqns. (12) and (52)) of the two
semidifferentials.


6.2.3 Connections between the subdifferential and superdifferential at X 

There are some interesting connections between $\partial_f(X)$ and $\partial^f(X)$. Firstly, it is clear from the definitions
that $\partial_f(X) \subseteq \partial_f^{(1,1)}(X)$ and $\partial^f(X) \subseteq \partial^f_{\Delta(1,1)}(X)$. Notice also that both $\partial_f^{(1,1)}(X)$ and $\partial^f_{\Delta(1,1)}(X)$ (from
Eqns. (12) and (52) respectively) are simple polyhedra containing a single extreme point $g_X \in \mathbb{R}^n$ defined as
follows:

$$g_X(j) = \begin{cases} f(j \mid X \setminus j) & \text{if } j \in X \\ f(j \mid X) & \text{if } j \notin X. \end{cases} \tag{72}$$

The point $g_X$ is, in general, neither a subgradient nor a supergradient at $X$. Each of the semidifferentials,
$\partial_f(X)$ and $\partial^f(X)$, however, is contained within a (distinct) polyhedron defined via $g_X$. In particular,
$g_X \in \partial_f^{(1,1)}(X)$ and $g_X \in \partial^f_{\Delta(1,1)}(X)$, and $g_X$ is an extreme point of both $\partial_f^{(1,1)}(X)$ and $\partial^f_{\Delta(1,1)}(X)$. An
illustration of this is in Figure 8. The subdifferential $\partial_f(X)$ is the red polyhedron, while the superdifferential
$\partial^f(X)$ is the blue polyhedron. Moreover, the light red and the light blue polyhedra are $\partial_f^{(1,1)}(X)$ and
$\partial^f_{\Delta(1,1)}(X)$, respectively, defined at $X = \{1\}$.

Since $\partial^f_{\Delta(1,1)}(X)$ is not a superdifferential, but rather an outer bound on one, the modular function

$$m^X(Y) \triangleq f(X) - g_X(X) + g_X(Y) \tag{73}$$

is not everywhere a modular upper bound on the submodular function $f$, although it is tight at $Y = X$. If
we consider, however, a subset $[\emptyset, X] \cup [X, V] = \{Y \in 2^V : Y \subseteq X \text{ or } Y \supseteq X\}$ of sets, then a supergradient
property is retained.

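Lemma 6.10. Given $X \subseteq V$, and any $Y \in [\emptyset, X] \cup [X, V]$, then

$$f(Y) \le f(X) - g_X(X) + g_X(Y). \tag{74}$$

Proof. Suppose $Y \subseteq X$; then

$$f(X) - f(Y) = f(X \setminus Y \mid Y) \ge \sum_{j \in X \setminus Y} f(j \mid X \setminus j). \tag{75}$$

If, on the other hand, $Y \supseteq X$, then

$$f(Y) - f(X) = f(Y \setminus X \mid X) \le \sum_{j \in Y \setminus X} f(j \mid X). \tag{76}$$

Combining the two together yields:

$$f(Y) \le f(X) + \sum_{j \in Y \setminus X} f(j \mid X) - \sum_{j \in X \setminus Y} f(j \mid X \setminus j) \tag{77}$$

$$= f(X) + g_X(Y \setminus X) - g_X(X \setminus Y) = f(X) + g_X(Y) - g_X(X). \tag{78}$$

□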
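Lemma 6.10 and the failure of the bound outside the sublattice $[\emptyset, X] \cup [X, V]$ can both be observed on the function of Table 1. The following sketch, with hypothetical helper names, evaluates the bound of Eqn. (74) for every pair of sets.

```python
from fractions import Fraction as Fr
from itertools import combinations

V = (1, 2, 3)
# The submodular function of Table 1, stored exactly.
TABLE_1 = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(1),
    frozenset({2}): Fr(2),
    frozenset({3}): Fr(2),
    frozenset({1, 2}): Fr(5, 2),
    frozenset({2, 3}): Fr(3),
    frozenset({1, 3}): Fr(14, 5),
    frozenset({1, 2, 3}): Fr(3),
}


def f(S):
    return TABLE_1[frozenset(S)]


def subsets(ground):
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def local_point(X):
    """The vector g_X of Eqn. (72)."""
    X = frozenset(X)
    return {
        j: f(X) - f(X - {j}) if j in X else f(X | {j}) - f(X)
        for j in V
    }


def bound_holds(X, Y):
    """Eqn. (74): f(Y) <= f(X) - g_X(X) + g_X(Y)."""
    g = local_point(X)
    return f(Y) <= f(X) - sum(g[j] for j in X) + sum(g[j] for j in Y)


all_sets = list(subsets(V))
holds_on_sublattice = all(
    bound_holds(X, Y)
    for X in all_sets
    for Y in all_sets
    if Y <= X or Y >= X
)
violations = [(X, Y) for X in all_sets for Y in all_sets if not bound_holds(X, Y)]
```

Every violation involves a set $Y$ that is neither a subset nor a superset of $X$; for example, $X = \{1\}$ and $Y = \{2\}$ give $g_X(Y) = 1.5 < f(Y) = 2$.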


6.2.4 Examples of inner and outer bounds for specific superdifferentials 

We next investigate the inner and outer bounds of specific instances. First, consider the superdifferential
$\partial^f(\emptyset)$ at the empty set $X = \emptyset$. In this case, notice that all three supergradients are the same vector,
i.e., $\hat{g}_\emptyset = \check{g}_\emptyset = \bar{g}_\emptyset$, with the individual elements being $\hat{g}_\emptyset(j) = \check{g}_\emptyset(j) = \bar{g}_\emptyset(j) = f(j), j \in V$.
Therefore, in this case, the inner bounds (Eqns. (66)-(68)) are exactly the superdifferential itself, and
$\hat{\partial}^f(\emptyset) = \check{\partial}^f(\emptyset) = \bar{\partial}^f(\emptyset) = \partial^f_{\text{conv}}(\emptyset) = \partial^f(\emptyset)$. Also, the largest outer bound $\partial^f_{\Delta(1,1)}(\emptyset)$ from Eqn. (52) has
the relationship $\partial^f_{\Delta(1,1)}(\emptyset) = \partial^f(\emptyset)$ since $g_\emptyset$ is also identical to these supergradients.
Therefore, in this case, all of the inner and outer polyhedral bounds are identical to the superdifferential.
This phenomenon also occurs for the superdifferential at the ground set, $\partial^f(V)$. For other sets, however,
this does not hold and the relationship between the inner and outer bounds and the superdifferential can be strict.

In the next example, we analyze the inner and outer bounds of the superdifferentials 
for some specific submodular functions in order to get further intuition about them. 

Example 6.3. In this example, we show how in 2-D some of the inner and outer
bounds are exact, and others of the inner bounds (resp. outer bounds) are strictly smaller (resp. larger) than
their corresponding exact superdifferentials. Assume the ground set is $V = \{1,2\}$. From Lemma 6.6, we know
that the superdifferentials $\partial^f(\emptyset)$ and $\partial^f(\{1,2\})$ are simple polyhedra and, as mentioned at the beginning of
Section 6.2.4, the inner and outer bounds are identical to the superdifferential itself.

Consider, however, $\partial^f(\{1\})$. Recall from Example 6.1 that the extreme points here are $(f(\{1\}), f(\{2\}))$
and $(f(\{1\} \mid \{2\}), f(\{2\} \mid \{1\}))$ respectively. Notice that $\hat{g}_{\{1\}} = (f(\{1\}), f(\{2\}))$ and $\check{g}_{\{1\}} = (f(\{1\} \mid \{2\}), f(\{2\} \mid \{1\}))$,
and hence both of these supergradients are extreme points of the superdifferential in two dimensions. Also note
that $\bar{g}_{\{1\}} = (f(\{1\} \mid \{2\}), f(\{2\}))$. Therefore, $\bar{g}_{\{1\}}$ lies in the interior of the superdifferential for a strictly submodular
function$^6$ since (considering inequalities in Eqns. (40)-(42) governing $\partial^f(\{1\})$ from Example 6.1), we have

6 A strict submodular function is a submodular function where none of the defining inequalities act as equalities. 


X          f(X)
∅          0
{1}        1
{2}        2
{3}        2
{1,2}      2.5
{2,3}      3
{1,3}      2.8
{1,2,3}    3

Table 1: An illustrative submodular function defined on V = {1,2,3}.
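
As a sanity check on Table 1, the following minimal sketch verifies by exhaustive enumeration that the tabulated function is submodular. The helper names (`subsets`, `is_submodular`) are illustrative and not taken from the paper; exact rationals are used so that values such as 2.8 do not introduce floating-point noise.

```python
from fractions import Fraction
from itertools import combinations

V = (1, 2, 3)

# Values copied from Table 1, stored as exact rationals.
f = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1),
    frozenset({2}): Fraction(2),
    frozenset({3}): Fraction(2),
    frozenset({1, 2}): Fraction(5, 2),
    frozenset({2, 3}): Fraction(3),
    frozenset({1, 3}): Fraction(14, 5),
    frozenset({1, 2, 3}): Fraction(3),
}


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def is_submodular(values, ground):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all pairs A, B."""
    all_sets = list(subsets(ground))
    return all(
        values[a] + values[b] >= values[a | b] + values[a & b]
        for a in all_sets
        for b in all_sets
    )


table_is_submodular = is_submodular(f, V)
```

The check is exponential in |V| and is only meant for toy instances such as this one.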

Figure 9: A visualization of the inner and outer bounds of the superdifferential. The submodular function
is given in Table 1. The shown superdifferential $\partial^f(X)$ is at X = {1}. The first figure (top left) shows the
superdifferential $\partial^f(X)$ itself, while the second one (top right) is the outer bound $\partial^f_{(1,1)}(X)$. The
bottom three figures show the inner bounds ca ({1}), £F({1}), and 9^({1}). The polyhedral inner bound 
= conv(9^({l}),9-^({l})) is not shown. 

$x_1 = f(\{1\}|\{2\}) < f(\{1\})$, $x_2 = f(\{2\}) > f(\{2\}|\{1\})$, and $x_1 - x_2 = f(\{1\}|\{2\}) - f(\{2\}) = f(\{1,2\}) - 2f(\{2\}) < f(\{1\}) - f(\{2\})$
(which follows since $f(\{1,2\}) < f(\{1\}) + f(\{2\})$). In this case, therefore,
Lemma 6.9 becomes <9^({1}) C <9^({1}) C 9^({1}) = 9^({1}) and 9^({1}) C 9^({1}) C 9^({1}) = 9^({1}). 

Similarly, observe from Eqn. (72) that j. = (/({l}),/({2}|{1}) € ^a(i i)({^-}) does not belong to 
c^({l}) when f is strictly submodular since it violates Eqn. (42), i.e., /({1}) — /({2}|{1}) < /({l}) — /({2}) 
(this does not hold since it would require /({2}|{1}) > /({2}) which violates strict submodularity). Hence 
^A(i i)({ 1 }) 7) 9^({1}). The same phenomena is true for <9^({2}). 

We can also consider the superdifferential in the three dimensional setting when V = {1,2,3}. In this case
we must consider a specific submodular function instance, and this is done in Table 1. Consider $\partial^f(\{1\})$. An
illustration of this is in Figure 9. The various polyhedra are shown shaded in blue. Note that the rectangular
polyhedron (upper right case) is the outer bound $\partial^f_{(1,1)}(X)$. In this case, it holds that

^({1» C df({l}) C 0'({1» C ^({l» C 9£ ( 1 i 1 ) ({1» (79) 

and 

^({i» c ^({l}) c 0 '({i}) c ^({l}) c d£ (lil) ({i» (80) 

That is, the subset relationships are strict in this case. 

6.2.5 Superdifferentials of subclasses of submodular functions

While it is hard to characterize superdifferentials of general submodular functions, certain subclasses have
easy characterizations. An important such subclass is the class of $M^{\natural}$-concave functions (footnote 7) [41] defined on $2^V$.
These include a number of special cases like matroid rank functions, concave over cardinality functions, etc.
All $M^{\natural}$-concave functions defined on $2^V$ are submodular on $2^V$ but not vice versa. In some sense, $M^{\natural}$-concave
functions very closely resemble concave functions. In particular, one can maximize these functions exactly
in polynomial time [41]. These functions also admit simple characterizations of their superdifferential. In
particular, the superdifferential of this class of functions can be represented using only $O(n^2)$ inequalities.
The following lemma provides a compact representation of the superdifferential of these functions.

Lemma 6.11. Given a submodular function f which is also $M^{\natural}$-concave on $\{0,1\}^V$, its superdifferential
satisfies:

$\partial^f(X) = \partial^f_{\Delta(2,2)}(X)$. (81)

In particular, it can be characterized via $O(n^2)$ inequalities.

Proof. A set function $\mu$ is said to be $M^{\natural}$-concave [42] if, for any $X, Y \subseteq V$ and any $i \in X \setminus Y$,
either the following inequality is true:

$\mu(X) + \mu(Y) \leq \mu(X \setminus \{i\}) + \mu(Y \cup \{i\})$, (82)

or, if not, there is some $j \in Y \setminus X$ where:

$\mu(X) + \mu(Y) \leq \mu((X \setminus \{i\}) \cup \{j\}) + \mu((Y \cup \{i\}) \setminus \{j\})$. (83)

This is called the exchange property for said functions. We then invoke Theorem 6.61 in [41], where the
authors show that for an $M^{\natural}$-convex function (which is supermodular on $2^V$ and is defined using the opposite
inequality to the above), its subdifferential (which in fact corresponds to a superdifferential of a submodular
function) can be expressed by just considering sets Y satisfying $|X \setminus Y| \leq 1$, $|Y \setminus X| \leq 1$ (i.e., of Hamming
distance at most two). In particular, we have that

$\partial^{\mu}(X) = \{x \in \mathbb{R}^n : x(j) \leq \mu(j | X \setminus j), \forall j \in X,$ (84)

$x(j) \geq \mu(j | X), \forall j \notin X,$ (85)

$x(i) - x(j) \leq \mu(X) - \mu((X \cup j) \setminus i), \forall i \in X, j \notin X\}$ (86)

Hence the superdifferential of an $M^{\natural}$-concave function (which is submodular) can be expressed with the same
number of inequalities, and the corresponding polyhedron is $\partial^f_{\Delta(2,2)}(X)$. □


6.3 Generalized Submodular Upper Polyhedron 

In this section, we generalize the submodular upper polyhedron from Section 6.1 in a manner analogous 
to how the generalized submodular lower polyhedron of Section 2.3 generalized the submodular (lower) 
polyhedron of Section 2.1. Unlike for the submodular lower polyhedron case, however, for the generalized 
submodular upper polyhedron some real utility will ensue. 

We define the generalized submodular upper polyhedron as the set of affine upper bounds of f as follows:

$\mathcal{P}^f_{\mathrm{gen}} = \{(x, c), x \in \mathbb{R}^n, c \in \mathbb{R} : x(X) + c \geq f(X), \forall X \subseteq V\}$ (87)

Again it is easy to see that $\mathcal{P}^f_{\mathrm{gen}} \cap \{(x, c) : c = 0\} = \{(x, c) : x \in \mathcal{P}^f, c = 0\}$. In other words, the slice c = 0
of the generalized submodular upper polyhedron is the submodular upper polyhedron of f. Also note that
the inequality at $X = \emptyset$ implies that $c \geq 0$. This polyhedron shall prove to be useful while defining concave
extensions of /. The generalized submodular upper polyhedron also has interesting connections with the 
superdifferentials. In particular, we have the following: 

7 In this paper, we consider only those $M^{\natural}$-concave functions defined on $2^V = \{0,1\}^V$, while $M^{\natural}$-concave functions are typically
defined [41] on $\mathbb{Z}^V$.

Lemma 6.12. Given a submodular function f, $(x, c) \in \mathcal{P}^f_{\mathrm{gen}}$ lies on a face of the polyhedron if and only if
there exists a set X such that $x \in \partial^f(X)$ and $c = f(X) - x(X)$.

Proof. The proof of this lemma is analogous to the one for the generalized submodular lower polyhedron in
Lemma 2.5. In particular, observe that (x, c) lies on a face of $\mathcal{P}^f_{\mathrm{gen}}$ if and only if there exists a set X such
that $x(X) + c = f(X)$ and, for all $Y \subseteq V$, $x(Y) + c \geq f(Y)$. This directly implies that $x \in \partial^f(X)$ and
$c = f(X) - x(X)$. □

This then implies the following corollary: 

Corollary 6.13. Given a submodular function f, a point (x, c) is an extreme point of $\mathcal{P}^f_{\mathrm{gen}}$ if and only if x
is an extreme point of $\partial^f(X)$ for some set X and $c = f(X) - x(X)$.

Proof. Assume that (x, c) is an extreme point of $\mathcal{P}^f_{\mathrm{gen}}$. Then, there must be n + 1 sets $X_0, X_1, \dots, X_n$ such
that $x(X_i) + c = f(X_i), \forall i = 0, 1, 2, \dots, n$, and $x(X) + c \geq f(X), \forall X \subseteq V$. In particular, $c = f(X_0) - x(X_0)$. This
implies that $x(X_i) + f(X_0) - x(X_0) = f(X_i), \forall i = 1, 2, \dots, n$, and $x(X) + f(X_0) - x(X_0) \geq f(X), \forall X \subseteq V$.
This implies that x is an extreme point of $\partial^f(X_0)$. To prove the other direction, we start with a set $X_0$
such that x is an extreme point of $\partial^f(X_0)$. Set $c = f(X_0) - x(X_0)$. Then, since x is an
extreme point of $\partial^f(X_0)$, we know that there exist n sets $X_1, \dots, X_n$ such that $x(X_i) + f(X_0) - x(X_0) =
f(X_i), \forall i = 1, 2, \dots, n$, and $x(X) + f(X_0) - x(X_0) \geq f(X), \forall X \subseteq V$. Substituting for c, we observe that
$x(X_i) + c = f(X_i), \forall i = 0, 1, 2, \dots, n$, and $x(X) + c \geq f(X), \forall X \subseteq V$. This proves that (x, c) is an extreme
point of $\mathcal{P}^f_{\mathrm{gen}}$. □

This implies an interesting characterization of a linear program over the generalized submodular upper 
polyhedron. 

Lemma 6.14. For a submodular function f and a $y \in \mathbb{R}^n$,

$\min_{(x,c) \in \mathcal{P}^f_{\mathrm{gen}}} [\langle x, y \rangle + c] = \min\{ \min_{x \in \partial^f(X)} [\langle x, y \rangle + f(X) - x(X)] \mid X \subseteq V \}$. (88)

Proof. We first show that the l.h.s. of Eqn. (88) is no larger than the r.h.s.
Observe that for any set X and point $x \in \partial^f(X)$, $(x, f(X) - x(X)) \in \mathcal{P}^f_{\mathrm{gen}}$. Hence the r.h.s.
is obtained by minimizing over only a subset of the polyhedron $\mathcal{P}^f_{\mathrm{gen}}$, and hence is an upper bound on the l.h.s. Next,
we show that the l.h.s. of Eqn. (88) is no smaller than the r.h.s. by invoking
Lemma 6.12. The minimum on the l.h.s. must occur at an extreme point of $\mathcal{P}^f_{\mathrm{gen}}$, which implies that
$x \in \partial^f(X)$ for some set X, and $c = f(X) - x(X)$. Hence
$\min_{(x,c) \in \mathcal{P}^f_{\mathrm{gen}}} [\langle x, y \rangle + c] \geq \min\{ \min_{x \in \partial^f(X)} [\langle x, y \rangle + f(X) - x(X)] \mid X \subseteq V \}$, since the l.h.s. equals a particular instance of the r.h.s. This
completes the proof. □

Unfortunately, however, the generalized submodular upper polyhedron is no longer easy to characterize. 
This is related to the fact that the superdifferentials of a submodular function are not easy to characterize. 

Lemma 6.15. The generalized submodular upper polyhedron membership problem for a submodular function
f (i.e., given an $x \in \mathbb{R}^n$ and $c \in \mathbb{R}$, solve the query “Is $(x, c) \in \mathcal{P}^f_{\mathrm{gen}}$?”) is NP hard for c > 0. Furthermore,
for any $y \in \mathbb{R}^n$, solving a linear program over this polyhedron, i.e., $\min_{(x,c) \in \mathcal{P}^f_{\mathrm{gen}}} \langle x, y \rangle + c$, is also NP hard.

Proof. The first part of the result follows from the fact that asking whether $(x, c) \in \mathcal{P}^f_{\mathrm{gen}}$ is equivalent to
asking whether $\max_{X \subseteq V} [f(X) - x(X) - c] \leq 0$, which can be rewritten as $\max_{X \subseteq V} [f(X) - x(X)] \leq c$. This
is the decision version of submodular maximization, which is NP hard. The second part follows directly
from the first since the membership problem on a polyhedron is equivalent to a linear program over this
polyhedron [22, 51]. □
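
The reduction used in the proof can be made concrete with a small sketch. It tests membership of (x, c) in the generalized submodular upper polyhedron by exhaustively computing $\max_X [f(X) - x(X)]$, which takes exponential time and is therefore only an illustration on toy instances, consistent with the hardness result. The function names and the dictionary encoding of f are assumptions made for this example; the two-element function is the one used later in Figure 10.

```python
from fractions import Fraction
from itertools import combinations


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def max_violation(f, ground, x):
    """Return max over X of f(X) - x(X), by exhaustive search."""
    return max(f[S] - sum(x[j] for j in S) for S in subsets(ground))


def in_generalized_upper_polyhedron(f, ground, x, c):
    """(x, c) is a member iff max_X [f(X) - x(X)] <= c."""
    return max_violation(f, ground, x) <= c


# Two-element example: f({1}) = 1, f({2}) = 2, f({1,2}) = 2.5.
f2 = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1),
    frozenset({2}): Fraction(2),
    frozenset({1, 2}): Fraction(5, 2),
}
half = Fraction(1, 2)
point = {1: half, 2: 3 * half}
inside = in_generalized_upper_polyhedron(f2, {1, 2}, point, half)
outside = in_generalized_upper_polyhedron(f2, {1, 2}, point, Fraction(2, 5))
```

Here `inside` is true because every constraint is tight at (0.5, 1.5, 0.5), while lowering c to 0.4 violates all three nonempty-set constraints.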

We can also prove the second part (that solving a linear program over the generalized submodular upper
polyhedron is NP hard) by observing that it is equivalent to computing the concave extension of a submodular function
(we show this in Lemma 7.1). Computing, and in fact even evaluating at a point, this concave extension,
however, is NP hard [9, 54].

Recall that in the case of the generalized submodular lower polyhedron, the extreme points of this
polyhedron were identical to the extreme points of the submodular lower polyhedron (i.e., all extreme points
of the generalized submodular lower polyhedron occurred when c = 0), which meant that the linear program over the
two polyhedra was the same. This is not the case for the generalized submodular upper polyhedron. To see
this, we consider a simple example with V = {1,2}.


Generalized Submodular Upper (Blue) and Lower (Red) Polyhedron 



Figure 10: The top figure shows the generalized submodular upper polyhedron (Eqn. (87)) in blue, while the
bottom shows the generalized lower polyhedron (Eqn. (13)) in red for a submodular function $f : 2^{\{1,2\}} \to \mathbb{R}$,
with $f(\emptyset) = 0$, $f(\{1\}) = 1$, $f(\{2\}) = 2$, $f(\{1,2\}) = 2.5$. The polyhedra live in three dimensions for two-
dimensional submodular functions. Notice that all the extreme points (green) of the generalized submodular
lower polyhedron are on the plane c = 0; the two extreme points are (1, 1.5, 0) and (0.5, 2, 0). In the
generalized submodular upper polyhedron, however, one of the extreme points is on c = 0 (this extreme point
is (1, 2, 0)), while the other extreme point is (0.5, 1.5, 0.5) (here c = 0.5 > 0).


Example 6.4. First consider the generalized submodular lower polyhedron when V = {1,2}:

$\mathcal{P}_f^{\mathrm{gen}} = \{(x, c) \in \mathbb{R}^3 : c \leq 0,$ (89)
$x_1 + c \leq f(\{1\}),$ (90)
$x_2 + c \leq f(\{2\}),$ (91)
$x_1 + x_2 + c \leq f(\{1,2\})\}$ (92)

It is immediate that the only extreme points are $(f(\{1\}), f(\{2\}|\{1\}), 0)$ and $(f(\{1\}|\{2\}), f(\{2\}), 0)$, which are
obtained by setting Eqns. (89), (90), (92) and Eqns. (89), (91), (92) as equalities, respectively. The extreme points in this
case are a direct product between the extreme points of $\mathcal{P}_f$ and c = 0. Hence all extreme points lie on the face
c = 0.


This is not the case for the generalized submodular upper polyhedron. Consider again the example with
V = {1,2}.

Example 6.5. The generalized submodular upper polyhedron in this case is

$\mathcal{P}^f_{\mathrm{gen}} = \{(x, c) \in \mathbb{R}^3 : c \geq 0,$ (93)
$x_1 + c \geq f(\{1\}),$ (94)
$x_2 + c \geq f(\{2\}),$ (95)
$x_1 + x_2 + c \geq f(\{1,2\})\}$ (96)

It is again immediate that the only extreme points are $(f(\{1\}), f(\{2\}), 0)$ and
$(f(\{1\}|\{2\}), f(\{2\}|\{1\}), f(\{1\}) + f(\{2\}) - f(\{1,2\}))$, which are obtained by setting Eqns. (93), (94), (95) and
Eqns. (94), (95), (96) as equalities, respectively (setting the other combinations of inequalities as equalities does not give
extreme points). Hence while one of the extreme points here is $(f(\{1\}), f(\{2\}), 0)$, which is the direct product
between an extreme point of $\mathcal{P}^f$ and c = 0, the other extreme point occurs at $c = f(\{1\}) + f(\{2\}) - f(\{1,2\})$, and when f is strictly submodular, c > 0.

An illustration of the generalized submodular upper and lower polyhedra is shown in Figure 10. 
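
The extreme points claimed in Examples 6.4 and 6.5 can be checked numerically for the function of Figure 10. The sketch below is an illustrative brute-force vertex enumeration, not a method from the paper: it solves every choice of three tight constraints by Cramer's rule and keeps the feasible solutions. All names (`ROWS`, `det3`, `solve3`, `vertices`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

F1, F2, F12 = Fraction(1), Fraction(2), Fraction(5, 2)

# Each row (a1, a2, a3, b) encodes a1*x1 + a2*x2 + a3*c compared with b.
# The comparison is ">=" for the upper polyhedron and "<=" for the lower one.
ROWS = [
    (0, 0, 1, Fraction(0)),  # c            vs 0
    (1, 0, 1, F1),           # x1 + c       vs f({1})
    (0, 1, 1, F2),           # x2 + c       vs f({2})
    (1, 1, 1, F12),          # x1 + x2 + c  vs f({1,2})
]


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, g), (h, i, j) = m
    return a * (e * j - g * i) - b * (d * j - g * h) + c * (d * i - e * h)


def solve3(rows):
    """Solve three equalities by Cramer's rule; return None if singular."""
    A = [list(r[:3]) for r in rows]
    rhs = [r[3] for r in rows]
    d = det3(A)
    if d == 0:
        return None
    sol = []
    for col in range(3):
        M = [row[:] for row in A]
        for k in range(3):
            M[k][col] = rhs[k]
        sol.append(Fraction(det3(M)) / d)
    return tuple(sol)


def vertices(upper):
    """Feasible points at which three of the four constraints are tight."""
    found = set()
    for trio in combinations(ROWS, 3):
        p = solve3(trio)
        if p is None:
            continue
        lhs = [r[0] * p[0] + r[1] * p[1] + r[2] * p[2] for r in ROWS]
        if all((l >= r[3]) if upper else (l <= r[3]) for l, r in zip(lhs, ROWS)):
            found.add(p)
    return found


upper_vertices = vertices(upper=True)
lower_vertices = vertices(upper=False)
```

For this instance the enumeration recovers (1, 2, 0) and (0.5, 1.5, 0.5) for the upper polyhedron and (1, 1.5, 0) and (0.5, 2, 0) for the lower one, matching the caption of Figure 10.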

6.3.1 Inner and outer bounds on the generalized submodular upper polyhedron 

In a manner similar to the superdifferential, we can provide inner and outer bounds of the generalized
submodular upper polyhedron. In particular, let $g_X \in \partial^f(X)$ be a supergradient at X that is feasible to
obtain (such as the ones in Eqns. (53)-(55)). Then, $m_X(Y) = f(X) + g_X(Y) - g_X(X)$ is a modular upper
bound of $f(Y), \forall Y \subseteq V$. Given any set $\mathcal{G} = \{g_X \in \partial^f(X) \mid X \subseteq V\}$ of such supergradients, we may define a
polytope as follows:

$\mathcal{P}^f_{\mathcal{G},\mathrm{gen}} = \text{conv-hull}\{(g_X, f(X) - g_X(X)), \forall X \subseteq V, g_X \in \mathcal{G}\}$. (97)

Since for any $X \subseteq V$ and $g_X \in \partial^f(X)$, we have that $(g_X, f(X) - g_X(X)) \in \mathcal{P}^f_{\mathrm{gen}}$, it follows from the convexity
of $\mathcal{P}^f_{\mathrm{gen}}$ that $\mathcal{P}^f_{\mathcal{G},\mathrm{gen}} \subseteq \mathcal{P}^f_{\mathrm{gen}}$. Moreover, larger inner bounds of $\mathcal{P}^f_{\mathrm{gen}}$ can be obtained by taking the convex
hull of multiple such polytopes of the form $\mathcal{P}^f_{\mathcal{G},\mathrm{gen}}$ for various $\mathcal{G}$. We shall in particular be interested in the
sets $\hat{\mathcal{G}} = \{\hat{g}_X \mid X \subseteq V\}$, $\check{\mathcal{G}} = \{\check{g}_X \mid X \subseteq V\}$, and $\bar{\mathcal{G}} = \{\bar{g}_X \mid X \subseteq V\}$, formed using Eqns. (53)-(55), whose
polytopes we will refer to as $\mathcal{P}^f_{\hat{\mathcal{G}},\mathrm{gen}}$, $\mathcal{P}^f_{\check{\mathcal{G}},\mathrm{gen}}$, and $\mathcal{P}^f_{\bar{\mathcal{G}},\mathrm{gen}}$. These bounds, as we shall see, have interesting connections
to concave extensions (which we shall describe in Section 7) and ultimately to submodular maximization.

In a fashion analogous to how, in Section 6.2.1, we defined outer bounds on the superdifferential,
we can similarly define outer bounds of the generalized submodular upper polyhedron by considering only a
subset of the inequalities that define $\mathcal{P}^f_{\mathrm{gen}}$. We do not pursue this here and leave it to future work (see Section 10).
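
To illustrate the modular upper bounds $m_X(Y) = f(X) + g_X(Y) - g_X(X)$ used above, the following sketch builds three supergradients for the function of Table 1 and checks the upper-bound property by enumeration. Eqns. (53)-(55) are not restated in this section, so the closed forms below (often called the grow, shrink, and bar supergradients) are an assumption; they agree with the two-dimensional values quoted in Example 6.3. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

V = frozenset({1, 2, 3})
TABLE = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1),
    frozenset({2}): Fraction(2),
    frozenset({3}): Fraction(2),
    frozenset({1, 2}): Fraction(5, 2),
    frozenset({2, 3}): Fraction(3),
    frozenset({1, 3}): Fraction(14, 5),
    frozenset({1, 2, 3}): Fraction(3),
}
EMPTY = frozenset()


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def gain(j, S):
    """Marginal gain f(j | S) = f(S + j) - f(S)."""
    return TABLE[S | {j}] - TABLE[S]


def grow(X):
    """Assumed hat-g_X: f(j | X - j) on X, f(j | empty set) off X."""
    return {j: gain(j, X - {j}) if j in X else gain(j, EMPTY) for j in V}


def shrink(X):
    """Assumed check-g_X: f(j | V - j) on X, f(j | X) off X."""
    return {j: gain(j, V - {j}) if j in X else gain(j, X) for j in V}


def bar(X):
    """Assumed bar-g_X: f(j | V - j) on X, f(j | empty set) off X."""
    return {j: gain(j, V - {j}) if j in X else gain(j, EMPTY) for j in V}


def modular_upper_bound(g, X, Y):
    """m_X(Y) = f(X) + g(Y) - g(X)."""
    return TABLE[X] + sum(g[j] for j in Y) - sum(g[j] for j in X)


all_bounds_valid = all(
    modular_upper_bound(make(X), X, Y) >= TABLE[Y]
    for make in (grow, shrink, bar)
    for X in subsets(V)
    for Y in subsets(V)
)
```

Each pair $(g_X, f(X) - g_X(X))$ produced this way is one of the generating points of the polytope in Eqn. (97).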


7 Concave extensions of a submodular function 

Following the characterizations of the convex extensions of a submodular function, we can define the concave 
extensions also from two viewpoints, one in the distributional setting and another in the polyhedral setting. 
These results follow in the lines of the results shown in Section 3 for the convex extensions. 


7.1 Polyhedral characterization of the concave extension 

Similar to the convex extension, the concave extension of any set function (not necessarily submodular) can
be seen as the pointwise infimum of concave functions that upper bound the set function [9]. Precisely, let

$\Psi_f = \{\psi : \psi \text{ is concave in } [0,1]^V \text{ and } \psi(1_X) \geq f(X), \forall X \subseteq V\}$. (98)

Then define the concave extension $\hat{f} : [0,1]^{|V|} \to \mathbb{R}$ as follows:

$\hat{f}(w) = \min_{\psi \in \Psi_f} \psi(w)$. (99)

Following arguments similar to the convex extension, Eqn. (99) can be expressed as a linear program over the
generalized submodular upper polyhedron.

Lemma 7.1. The concave extension in Eqn. (99) for any set function f can be expressed as:

$\hat{f}(w) = \min_{(y,c) \in \mathcal{P}^f_{\mathrm{gen}}} [\langle y, w \rangle + c]$ (100)

Proof. The proof of this lemma follows the proof of Lemma 3.1. For a given w, let $\psi$ be an argmin in Eqn. (99).
Then since $\psi$ is a concave function in $[0,1]^V$, there exists a supergradient $x \in \mathbb{R}^n$ at w and a value d such that
$\langle x, y \rangle + d \geq \psi(y), \forall y$, and $\langle x, w \rangle + d = \psi(w)$. In other words, $\langle x, y \rangle + d$ is a linear upper bound of $\psi(y)$, tight at
w. Hence $\hat{f}(w) = \langle x, w \rangle + d$. Finally notice that $(x, d) \in \mathcal{P}^f_{\mathrm{gen}}$ since $x(X) + d \geq \psi(1_X) \geq f(X), \forall X \subseteq V$. □

Unlike the case shown in Lemma 3.2 for the convex extension, however, this is not equivalent to an
optimization over the submodular upper polyhedron. That is, we may not assume c = 0 in Eqn. (100) for a
submodular function. Moreover, this expression requires solving a linear program over the generalized submodular upper
polyhedron, and it follows from Lemma 6.15 that obtaining the concave extension is NP hard. We shall
revisit this result in the next subsection while investigating the distributional characterization.


7.2 Concave upper and lower bounds of the concave extension 

Interestingly, we can define a number of concave extensions based on relaxations of the polyhedral representation.
In particular, consider the inner approximations $\mathcal{P}^f_{\mathcal{G},\mathrm{gen}}$ of the generalized submodular upper polyhedron,
defined via a particular set of supergradients $\mathcal{G} = \{g_X \in \partial^f(X) \mid X \subseteq V\}$. Instead of minimizing over
all affine upper bounds, we can minimize only over a particular class of modular upper bounds. Then, we can
define the following form of a concave extension:

$\hat{f}_{\mathcal{G}}(w) = \min_{(y,c) \in \mathcal{P}^f_{\mathcal{G},\mathrm{gen}}} [\langle y, w \rangle + c] = \min_{g_Y \in \mathcal{G}} [\langle g_Y, w \rangle + f(Y) - g_Y(Y)], \quad \forall w \in [0,1]^{|V|}$ (101)

In particular, the above turns the linear program into a discrete optimization problem. Moreover, the concave
extension $\hat{f}_{\mathcal{G}}$ is guaranteed to be an upper bound of $\hat{f}$. We can define three variants of these extensions using
the sets $\hat{\mathcal{G}} = \{\hat{g}_X \mid X \subseteq V\}$, $\check{\mathcal{G}} = \{\check{g}_X \mid X \subseteq V\}$, and $\bar{\mathcal{G}} = \{\bar{g}_X \mid X \subseteq V\}$, which we call $\hat{f}_{\hat{\mathcal{G}}}$, $\hat{f}_{\check{\mathcal{G}}}$, and $\hat{f}_{\bar{\mathcal{G}}}$.
These concave extensions can, in fact, be obtained in polynomial time since each evaluation involves a submodular function
minimization.
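
A minimal sketch of Eqn. (101) for the two-element function of Figure 10 follows. It evaluates the relaxed extension by enumerating all sets Y rather than by submodular function minimization, so it only illustrates the definition. The closed form assumed for the supergradient $\hat{g}_Y$ (marginal gain $f(j | Y \setminus j)$ on Y and $f(j | \emptyset)$ off Y) is not restated in this section and is labeled as an assumption; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

V = frozenset({1, 2})
F = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1),
    frozenset({2}): Fraction(2),
    frozenset({1, 2}): Fraction(5, 2),
}


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def grow_supergradient(Y):
    """Assumed form of hat-g_Y: f(j | Y - j) on Y, f(j | empty set) off Y."""
    return {
        j: (F[Y] - F[Y - {j}]) if j in Y else (F[frozenset({j})] - F[frozenset()])
        for j in V
    }


def relaxed_concave_extension(w):
    """min over Y of <g_Y, w> + f(Y) - g_Y(Y), by enumeration."""
    best = None
    for Y in subsets(V):
        g = grow_supergradient(Y)
        value = sum(g[j] * w[j] for j in V) + F[Y] - sum(g[j] for j in Y)
        if best is None or value < best:
            best = value
    return best


def indicator(S):
    """Characteristic vector of S as a dict over V."""
    return {j: Fraction(1 if j in S else 0) for j in V}


midpoint_value = relaxed_concave_extension({1: Fraction(1, 2), 2: Fraction(1, 2)})
```

At the vertices of the hypercube the relaxed extension coincides with f, since the bound $m_X$ is tight at X and every other $m_Y$ upper bounds f.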

The class of concave extensions suggested by Eqn. (101) has some connections to a form of concave
extension proposed in [54] for monotone submodular functions. In particular, [54] defined a concave
function $\hat{f}_{\mathcal{G}^v}$ that takes the following form:

$\hat{f}_{\mathcal{G}^v}(x) \triangleq \min\{ f(Y) + \sum_{j \in V} x(j) f(j|Y) \mid Y \subseteq V \} = \min\{ f(Y) + \sum_{j \notin Y} x(j) f(j|Y) \mid Y \subseteq V \}$ (102)

This extension can be seen as a special case of Eqn. (101) with a particular set of supergradients
$\mathcal{G}^v = \{g^v_X \in \partial^f(X) \mid X \subseteq V\}$ defined as:

$g^v_X(j) = 0$ if $j \in X$, and $g^v_X(j) = f(j|X)$ if $j \notin X$. (103)

This supergradient is related to the supergradient $\check{g}_X$ in Eqn. (54) except that it replaces the values $f(j | V \setminus j)$
for $j \in X$ with 0. For a monotone submodular function, this remains a supergradient (but not for a
non-monotone submodular function). This form of concave extension is NP hard to evaluate (see Section 3.7
in [54]) but is still useful in obtaining approximate maximizers for certain special cases (see Section 7.5).

Using outer bounds of the generalized submodular upper polyhedron, defined by considering only a subset
of the inequalities that define $\mathcal{P}^f_{\mathrm{gen}}$, it would be possible to define tractable lower bounds on the concave extension.
We leave this to future work (see Section 10).

7.3 Distributional characterization of the concave extension

As with the convex extension and as shown in Section 3.2, an alternate and equivalent characterization of the
concave extension can be viewed through a distributional lens.

Lemma 7.2. Recall from Eqn. (20) the set $\Lambda_w$, defined here again for convenience:

$\Lambda_w = \{ \{\lambda_S, S \subseteq V\} : \sum_{S \subseteq V} \lambda_S 1_S = w, \sum_{S \subseteq V} \lambda_S = 1, \text{ and } \forall S, \lambda_S \geq 0 \}$. (20)

The concave extension from Eqn. (100) can then also be represented as:

$\hat{f}(w) = \max_{\lambda \in \Lambda_w} \sum_{S \subseteq V} \lambda_S f(S)$ (104)

The proof of the above follows along similar lines as for the convex extension, and is shown in [9]. Unfortunately,
unlike the convex extension, this extension is NP hard to evaluate and optimize over.

Proposition 7.3. Given a submodular function f, it is NP hard to evaluate and optimize $\hat{f}$.

This result is shown in [54].

As with the polyhedral characterization, we can relax the distributional characterization to consider
specific simplified distributions. In particular, we can obtain the multilinear extension through a particular
distribution, namely $\{\lambda_S = \prod_{i \in S} x_i \prod_{i \notin S} (1 - x_i), S \subseteq V\} \in \Lambda_x$. Then the multilinear extension is defined as
follows:

$\tilde{f}(x) = \sum_{S \subseteq V} \lambda_S f(S) = \sum_{S \subseteq V} f(S) \prod_{i \in S} x_i \prod_{i \notin S} (1 - x_i)$ (105)

It is not hard to see that this forms a lower bound on the concave extension $\hat{f}$. This extension is not concave,
however, unlike the extensions described in Section 7.2. Similar to the concave extension, it is hard to
evaluate this extension, and it typically requires sampling [54] in practice to get an estimate, although special
cases exist where it can be analytically expressed and computed exactly [32].

7.4 Concave extensions of subclasses of submodular functions 

While the concave extension f(x) is NP hard to compute in general, it can be done efficiently for certain 
subclasses of submodular functions. These include, for example, sums of weighted matroid rank functions [54], 
and the class of concave functions (c.f. Theorem 6.42 in [41]). 

7.5 Concave extensions and submodular maximization 

The concave extensions and the multilinear extension have interesting connections to submodular maximization. 
The following lemma from [54] connects many of these extensions: 

Lemma 7.4. [5f] For every monotone submodular function /, /( x) > f(x) > (1 — A )f(y )• 

It is also possible to relate all of the three extensions of a submodular function, namely the convex 
extension, the concave extension, and the multilinear extension. 

Lemma 7.5. Given a submodular function, it holds that 

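$\hat{f}(x) \geq \tilde{f}(x) \geq \check{f}(x)$ (106)

where $\hat{f}$, $\tilde{f}$, and $\check{f}$ denote the concave, multilinear, and convex extensions respectively.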

Proof. The proof of this result follows directly from the distributional characterization of the convex and 
concave extensions, Eqns. (21), (104), and (105). Note that the multilinear extension uses a particular 
distribution, the concave extension is a pointwise maximum over all such distributions, while the convex 
extension is a pointwise minimum over these distributions. □ 
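
The right-hand inequality of Lemma 7.5 can be checked numerically on the function of Table 1. The sketch below evaluates the multilinear extension exactly by summing over all subsets (feasible only for tiny ground sets) and computes the convex extension through the standard sorting formula for the Lovász extension, which coincides with the convex extension for a normalized submodular function. The concave extension itself would require a linear program and is not computed here. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

V = (1, 2, 3)
f = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1),
    frozenset({2}): Fraction(2),
    frozenset({3}): Fraction(2),
    frozenset({1, 2}): Fraction(5, 2),
    frozenset({2, 3}): Fraction(3),
    frozenset({1, 3}): Fraction(14, 5),
    frozenset({1, 2, 3}): Fraction(3),
}


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    for r in range(len(ground) + 1):
        for combo in combinations(ground, r):
            yield frozenset(combo)


def multilinear(x):
    """Exact multilinear extension: expectation of f under independent sampling."""
    total = Fraction(0)
    for S in subsets(V):
        prob = Fraction(1)
        for i in V:
            prob *= x[i] if i in S else 1 - x[i]
        total += prob * f[S]
    return total


def lovasz(x):
    """Lovasz extension via sorting; assumes f(empty set) = 0."""
    order = sorted(V, key=lambda i: x[i], reverse=True)
    value, prefix = Fraction(0), frozenset()
    for i in order:
        nxt = prefix | {i}
        value += x[i] * (f[nxt] - f[prefix])
        prefix = nxt
    return value


grid = [Fraction(k, 4) for k in range(5)]
points = [dict(zip(V, p)) for p in product(grid, repeat=len(V))]
multilinear_dominates = all(multilinear(x) >= lovasz(x) for x in points)
```

Both extensions agree with f at the vertices of the hypercube, so the inequality is tight there.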

The facts above were used in providing a relaxation based algorithm for maximizing a subclass of
submodular functions efficiently [54]. This relaxation based algorithm maximizes the concave extension
$\hat{f}(x)$ that, while NP hard to optimize in general, can be maximized in certain special cases. The particular
special case which is considered in [54] is the class of weighted matroid rank functions, for which the concave
extension has a simple form. Furthermore, a pipage rounding method ensures no integrality gap with respect
to the multilinear extension, thus providing a 1 − 1/e approximation algorithm for the problem of maximizing
a monotone submodular function subject to a matroid constraint. Later, a conditional gradient
style algorithm, also called the continuous greedy algorithm [55], directly optimized the multilinear extension,
thereby providing a general 1 − 1/e approximation algorithm for monotone submodular maximization subject
to matroid constraints. This was later extended to the non-monotone case by [6].

8 Optimality Conditions for submodular maximization 

Just as the subdifferential of a submodular function provides optimality conditions for submodular minimization,
the superdifferential provides the optimality conditions for submodular maximization.

8.1 Unconstrained submodular maximization 

In this section, we consider the general problem of unconstrained submodular maximization:

$\max_{X \subseteq V} f(X)$ (107)

Given a submodular function, we can give KKT-like conditions for submodular maximization, and this is
done in the following lemma:

Lemma 8.1. For a submodular function f, a set A is a maximizer of f if $0 \in \partial^f(A)$.

As expected, however, finding a set A with the property above, or even verifying for a given set
A whether $0 \in \partial^f(A)$, are both NP hard problems (from Lemma 6.5). Thanks to submodularity, we show
that the aforementioned outer bounds on the superdifferential provide approximate optimality conditions for
submodular maximization. Moreover, unlike the superdifferential, these bounds are easy to obtain.

Proposition 8.2. For a submodular function f, if $0 \in \partial^f_{(1,1)}(A)$ then A is a local maximum of f (that is,
$\forall B \supseteq A, f(A) \geq f(B)$ and $\forall C \subseteq A, f(A) \geq f(C)$). Furthermore, if we define $S = \arg\max_{X \in \{A, V \setminus A\}} f(X)$,
then $f(S) \geq \frac{1}{3}\mathrm{OPT}$, where OPT is the optimal value.

The above result is interesting since a very simple outer bound on the superdifferential leads us to an
approximate optimality condition for submodular maximization. The local optimality condition follows
directly from the definition of $\partial^f_{(1,1)}(A)$, and the approximation guarantee follows directly from Theorem 3.4
in [12].

We can also provide a sufficient condition for the maximizers of a submodular function. 

Lemma 8.3. If for any set A, 0 £ £P(A), then A is the global maxima of the submodular function. 

Proof. This proof follows from the fact that d^(A) C d?(A) C jAA). Thus, if 0 £ d^(A), it must also 
belong to cA(A), which means A is the global optimizer of /. □ 

Based on Lemmas 8.2 and 8.3, if a local maxima A is found (which is relatively easy because ^(A) 

is easy to characterize) and if it happens that 0 £ cP(A) (which is easy to check since d? (A) is easy to 
characterize), then we have a certificate of a global maxima of the submodular function. 
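
The local optimality condition of Proposition 8.2 is easy to test computationally. The sketch below enumerates the local maxima (under single insertions and deletions) of a small directed cut function and checks that the better of A and its complement attains at least one third of the optimum. The graph and its weights are hypothetical, and the example assumes f is non-negative, as required by the guarantee of Theorem 3.4 in [12]; a directed cut function satisfies this.

```python
from itertools import combinations

V = frozenset(range(4))

# Hypothetical weighted directed edges (tail, head, weight).
EDGES = [(0, 1, 3), (1, 2, 2), (2, 0, 1), (2, 3, 4), (3, 1, 1)]


def cut(S):
    """Directed cut value: total weight of edges leaving S (submodular, non-negative)."""
    return sum(w for u, v, w in EDGES if u in S and v not in S)


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_local_max(A):
    """True if no single insertion or deletion strictly improves the value."""
    no_gain_from_adding = all(cut(A | {j}) <= cut(A) for j in V - A)
    no_gain_from_removing = all(cut(A - {j}) <= cut(A) for j in A)
    return no_gain_from_adding and no_gain_from_removing


OPT = max(cut(S) for S in subsets(V))
local_maxima = [A for A in subsets(V) if is_local_max(A)]

# Integer arithmetic: 3 * max(f(A), f(V - A)) >= OPT is the 1/3 guarantee.
guarantee_holds = all(3 * max(cut(A), cut(V - A)) >= OPT for A in local_maxima)
```

By submodularity, a set that cannot be improved by single-element moves is also at least as good as all of its subsets and supersets, which is the notion of local maximum used in the proposition.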

8.2 Constrained submodular maximization

We can also provide similar results for constrained submodular maximization, as we show in the present
section. We consider a constrained submodular maximization problem with $\mathcal{C} \subseteq 2^V$ representing a set of sets,
and consider the following problem:

$\max_{X \in \mathcal{C}} f(X)$ (108)

For example, $\mathcal{C}$ could represent a cardinality constraint $\{X \subseteq V : |X| \leq m\}$, or a spanning tree, matching,
s-t path constraints, etc. Another common type of constraint is a matroid independence constraint (i.e., $\mathcal{C}$
consists of the set of independent sets of some matroid). Let $\mathcal{I}$ denote the family of independent sets of a matroid $\mathcal{M}$.
Then $\mathcal{C} = \{X : X \in \mathcal{I}\}$ is a matroid constraint. Similarly, $\mathcal{C} = \{X \subseteq V : c(X) \leq B\}$ represents a knapsack
constraint. We therefore refer to these as combinatorial constraints.

We then define a constraint-cognizant modification $\partial^f_{\mathcal{C}}(X)$ of the superdifferential (Eqn. (32)) as follows:

$\forall X \in \mathcal{C} \subseteq 2^V, \quad \partial^f_{\mathcal{C}}(X) = \{x \in \mathbb{R}^n : f(Y) - x(Y) \leq f(X) - x(X), \forall Y \in \mathcal{C}\}$ (109)

In other words, we only consider the feasible sets associated with the constraints. Then we can trivially define
a KKT-like optimality condition for the optimization problem:

Lemma 8.4. For a submodular function f, a set A is a maximizer of the problem $\max_{X \in \mathcal{C}} f(X)$ if $0 \in \partial^f_{\mathcal{C}}(A)$.

Clearly finding the set above is NP-hard. However, similar to the unconstrained setting, we show that, in 
a number of cases, approximating the superdifferential can lead to polynomial time algorithms for constrained 
submodular maximization with worst case approximation guarantees. This is done in several scenarios as we 
next discuss. 

8.2.1 Constrained monotone submodular function maximization 

Consider here a case where f is a monotone submodular function, and $\mathcal{C}$ is the constraint that the set belongs
to the intersection of the independence sets of k matroids. Let $\mathcal{M}_1, \mathcal{M}_2, \dots, \mathcal{M}_k$ represent the k matroids,
with corresponding independence sets $\mathcal{I}_1, \dots, \mathcal{I}_k$. Then $\mathcal{C} = \{X : X \in \cap_{i=1}^k \mathcal{I}_i\}$. Analogous to how we
defined outer bounds $\partial^f_{\Delta(k,l)}(X)$ on the superdifferential in Section 6.2.1 and Eqn. (50), we can also define the
outer bounds $\partial^f_{\mathcal{C},\Delta(k,l)}$ of $\partial^f_{\mathcal{C}}$ that correspond to $\partial^f_{\Delta(k,l)}(X)$ but that also are restricted to $\mathcal{C}$ in the sense of
Eqn. (109). We then make the following observation for the problem of monotone submodular maximization
subject to matroid constraints.

Observation 8.1. Given a monotone submodular function f and a constraint set $\mathcal{C} = \cap_{i=1}^k \mathcal{I}_i$, for any set
$A \in \mathcal{C}$, the following holds:

1. If 0 G d c A ^ 2 fc+1 )(A), then f(A) is guaranteed to be at least times the optimal value. In particular, 
for the special case of monotone submodular maximization subject to a single matroid constraint, for
any set $A \in \mathcal{C}$, if $0 \in \partial^f_{\mathcal{C},\Delta(2,2)}(A)$, then f(A) is guaranteed to be at least $\frac{1}{2}$ times the optimal value;

2. If k = 1 and $\mathcal{C}$ is a cardinality constraint (a uniform matroid constraint $\{X : |X| \leq m\}$), for any r > 0, a set A
satisfying $0 \in \partial^f_{\mathcal{C},\Delta(r+1,r+1)}(A)$ is guaranteed to have an approximation guarantee no worse than $\frac{m}{2m-r}$.

The first part of the observation (i.e., point 1) follows directly from Corollary 2.4 in [37]. In the case
of k = 1, the same result was shown in [15]. Moreover, it was also shown in [15] that for the problem
of monotone submodular maximization subject to k > 1 matroid constraints, the approximate optimality
condition $0 \in \partial^f_{\mathcal{C},\Delta(2,2)}(A)$ can be arbitrarily bad, thus requiring “higher order” optimality conditions. We
also remark that when k = 1 (i.e., submodular maximization subject to a single matroid constraint), this is
the same approximation factor that can be obtained by the simple greedy algorithm [45].

The second part (point 2) of the above observation (which is submodular maximization subject to cardinality
constraints) follows from Theorem 8 in [14]. In the case when r = 1, the condition $0 \in \partial^f_{\mathcal{C},\Delta(2,2)}(A)$ provides a
guarantee of m/(2m − 1), which is a slight improvement over 1/2 in the special case of cardinality constraints.
An interesting observation is that better forms of local optima (i.e., the condition $0 \in \partial^f_{\mathcal{C},\Delta(r+1,r+1)}(A)$ is a
local optimum up to size r) imply better approximation guarantees for this problem.

The approximation factor for k > 2 matroids can actually be improved as shown in [38]. 

Observation 8.2. Given a maximization problem of a monotone submodular function f subject to k > 1
matroid constraints, for a set $A \in \mathcal{C}$, if $0 \in \partial^f_{\mathcal{C},\Delta(p+1,kp+1)}(A)$, then f(A) is guaranteed to be at least $\frac{1}{k + 1/p}$
times the optimal value. In particular, for the special case of monotone submodular maximization subject to 2
matroid constraints, for any set $A \in \mathcal{C}$, if $0 \in \partial^f_{\mathcal{C},\Delta(p+1,2p+1)}(A)$, then f(A) is guaranteed to be at least $\frac{1}{2 + 1/p}$
times the optimal value.

This is currently the best known result for k > 1 matroids, a result that follows from Corollary 3.1 in [38]. 
Overall, the main insight in these results is that the local optima which are obtained through the local 
search algorithms can all be viewed as (approximate) optimality conditions obtained via outer bounds of the 
superdifferential of a submodular function. 

8.2.2 Constrained non-monotone submodular function maximization 

Finally, we consider the case of non-monotone submodular maximization subject to k matroid constraints. 
We first consider the case of symmetric submodular functions, i.e., f{A) = f(V \ A) for all A C V. 

Observation 8.3. Given a symmetric submodular function f, 

1. If the constraint set is as follows C = then any set A £ C satisfying 0 £ d(, k+1 ^(A), f(A) is 

guaranteed to be at least times the optimal value. 

2. If $\mathcal{C}$ is the set of bases of a matroid, then any set A satisfying $0 \in \partial^f_{\mathcal{C},\Delta(2,2)}(A)$ is guaranteed to have a

value at least 1/3 of the optimal.

The results in this proposition follow directly from the definitions above, and through the results in [37] 
(the first part follows from Theorem 2.8, while the second part is implied by Theorem 5.1). 

We lastly provide an approximation bound in terms of superdifferentials for non-monotone submodular 
maximization. 

Observation 8.4. Given a non-monotone submodular function f, where $\mathcal{C}$ is a cardinality constraint (a uniform matroid
constraint $\{X : |X| \leq m\}$), for any r > 0, a set A satisfying $0 \in \partial^f_{\mathcal{C},\Delta(r+1,r+1)}(A)$ is guaranteed to have an
approximation guarantee no worse than $\frac{m}{2m-r}$.

This result follows directly from Theorem 8 in [14]. The best bounds for non-monotone submodular
maximization require running several iterations of local search procedures. In particular, the procedure of [37]
runs k + 1 local search procedures to obtain a 1/(k + 2 + 1/k) approximation algorithm for non-monotone
submodular maximization subject to k matroid constraints. When k = 1, running two rounds of this
local search procedure results in a 1/4 approximation. Each individual local search procedure here obtains a
set A satisfying 0 S d f c (2 

9 Concave Characterizations: Discrete Separation Theorem and 
Fenchel Duality Theorem 

In Section 5, we investigated forms of the Discrete separation theorem (DST), the Fenchel duality theorem, 
and the Minkowski sum theorem for submodular functions and their associated polyhedra when seen from the 
convex perspective. We here analyze forms of the discrete separation theorem and Fenchel duality theorem 
for submodular functions and their associated polyhedra from the concave perspective. 

9.1 (Concave) Discrete Separation Theorem 

We first show that a restricted form of the discrete separation theorem holds that is, in some sense, 
the opposite of Frank's DST shown in Section 5.1 (Lemma 5.1). In particular, the current 
result is a “concave-like” variant of the discrete separation theorem. It shows that, under some 
very mild restrictions, given a submodular function $f$ and a supermodular function $g$ with $f(X) \le g(X), \forall X$, 
there exists a modular function $h$ such that $f(X) \le h(X) \le g(X)$. This therefore shows how submodularity 
can be seen as analogous to concavity, in the same way that Lemma 5.1 shows how submodularity can be 
seen as analogous to convexity. The lemma follows: 

Lemma 9.1 (Concave Discrete Separation Theorem (CDST)). Given a submodular function $f$ and a 
supermodular function $g$ such that $f(X) \le g(X), \forall X \subseteq V$, and such that either $f(\emptyset) = g(\emptyset)$ or $f(V) = g(V)$, 
there exists a modular function $h$ such that $f(X) \le h(X) \le g(X), \forall X \subseteq V$. Moreover, when $f$ and $g$ are 
integral (and satisfy the above conditions), there exists an integral $h$ satisfying the above. 

Proof. Assume first that $f(\emptyset) = g(\emptyset)$. Then let $h(X) = f(\emptyset) + \sum_{j \in X} f(j \mid \emptyset)$. Then the following chain of 
inequalities holds: 

$$f(X) \le h(X) = f(\emptyset) + \sum_{j \in X} f(j \mid \emptyset) \le g(\emptyset) + \sum_{j \in X} g(j \mid \emptyset) \le g(X), \tag{110}$$

where the middle inequality follows since $f(j \mid \emptyset) = f(j) - f(\emptyset) \le g(j) - g(\emptyset) = g(j \mid \emptyset)$. The rest of the inequalities follow from 
submodularity (and supermodularity) of $f$ (and $g$). The result for when $f(V) = g(V)$ analogously follows by 
considering the functions $f(V \setminus X)$ and $g(V \setminus X)$, which are submodular and supermodular respectively. □ 
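
The construction in the proof of Lemma 9.1 can be checked by brute force on a small ground set. The sketch below is only an illustration: the ground set `V`, the functions `f` (truncated cardinality) and `g` (squared cardinality), and the helper names are assumptions chosen for this example and do not come from the text above. It first confirms that the hypotheses of the lemma hold for this pair, and then that the modular function $h(X) = f(\emptyset) + \sum_{j \in X} f(j \mid \emptyset)$ separates them.

```python
from itertools import combinations

V = (0, 1, 2)  # hypothetical ground set


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    for size in range(len(ground) + 1):
        for combo in combinations(ground, size):
            yield frozenset(combo)


def f(X):
    """Illustrative submodular function: truncated cardinality min(|X|, 2)."""
    return min(len(X), 2)


def g(X):
    """Illustrative supermodular function: squared cardinality |X|^2."""
    return len(X) ** 2


def h(X):
    """Modular function from the proof: f(empty) + sum over j in X of f(j | empty)."""
    empty = frozenset()
    return f(empty) + sum(f(frozenset({j})) - f(empty) for j in X)


def is_submodular(fn, family):
    """Brute-force check of fn(A) + fn(B) >= fn(A | B) + fn(A & B)."""
    return all(fn(A) + fn(B) >= fn(A | B) + fn(A & B)
               for A in family for B in family)


all_sets = list(subsets(V))

# Hypotheses of the lemma: f submodular, g supermodular, f <= g, f(empty) = g(empty).
hypotheses_hold = (
    is_submodular(f, all_sets)
    and is_submodular(lambda X: -g(X), all_sets)
    and all(f(X) <= g(X) for X in all_sets)
    and f(frozenset()) == g(frozenset())
)

# Conclusion of the lemma: f <= h <= g on every subset.
separation_holds = all(f(X) <= h(X) <= g(X) for X in all_sets)

print(hypotheses_hold, separation_holds)
```

Since `f` and `g` are integral here, the separating function `h` is integral as well, in line with the last statement of the lemma.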

In particular, when $f$ and $g$ are normalized, i.e., $f(\emptyset) = g(\emptyset) = 0$, the above always holds with a normalized 
(i.e., $h(\emptyset) = 0$) modular function. Lemma 9.1, however, can in fact be further generalized. In particular, it is 
not hard to see that the result goes through whenever $\operatorname{argmin}_X [g(X) - f(X)]$ contains either $\emptyset$ or $V$. To show this, 
we provide a generalized form that depends on the outcome of $\operatorname{argmin}_X [g(X) - f(X)]$. 

Lemma 9.2 (Generalized Concave Discrete Separation Theorem). Given a submodular function $f$ and a 
supermodular function $g$ such that $f(X) \le g(X), \forall X \subseteq V$, let $A \in \operatorname{argmin}_X [g(X) - f(X)]$. Then there exists 
a modular function $h$ such that $f(X) \le h(X) \le g(X)$ for all $X$ with $X \subseteq A$ or $X \supseteq A$. 

Proof. The proof of this result is analogous to the earlier one. First, define $\alpha = \min_X [g(X) - f(X)]$ and 
$A \in \operatorname{argmin}_X [g(X) - f(X)]$. Then $\alpha \ge 0$, and we can define a function $f'$ with $f'(X) = f(X) + \alpha$, which is still 
submodular and satisfies $f' \le g$, so that $f'(A) = g(A)$. Then, given a modular separating function $h$ with 
$f'(X) \le h(X) \le g(X)$ for all $X \in [\emptyset, A] \cup [A, V]$, we have that $f'(A) = h(A) = g(A)$, and that 
$f(X) \le h(X) \le g(X)$ for all such $X$. Hence, given $A \in \operatorname{argmin}_X [g(X) - f(X)]$, 
we may and do assume without loss of generality that $f(A) = g(A)$. 

Next, define 


$$h(X) = f(A) + \sum_{j \in X \setminus A} f(j \mid A) - \sum_{j \in A \setminus X} f(j \mid A \setminus j). \tag{111}$$

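By Lemma 6.10, we have that $f(X) \le h(X)$ for all $X \in [\emptyset, A] \cup [A, V]$. Moreover, since $f(A) = g(A)$ and $f \le g$, 
we have $f(j \mid A) \le g(j \mid A)$ for $j \notin A$ and $f(j \mid A \setminus j) \ge g(j \mid A \setminus j)$ for $j \in A$. Applying a supermodular 
variant of Lemma 6.10, we then have that for all $X \in [\emptyset, A] \cup [A, V]$: 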

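$$h(X) = f(A) + \sum_{j \in X \setminus A} f(j \mid A) - \sum_{j \in A \setminus X} f(j \mid A \setminus j) \tag{112}$$

$$\le g(A) + \sum_{j \in X \setminus A} g(j \mid A) - \sum_{j \in A \setminus X} g(j \mid A \setminus j) \le g(X). \tag{113}$$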

□ 
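
The two steps of this proof, shifting $f$ so that it touches $g$ at a minimizer $A$ of $g - f$, and then building $h$ from the marginal gains at $A$, can be traced on a toy instance. The following is a minimal sketch under assumed data: the functions `f` and `g`, the ground set, and all helper names are hypothetical and were picked so that the minimizer is neither $\emptyset$ nor $V$. The check is restricted to the sets comparable with $A$, which is all the lemma guarantees.

```python
from itertools import combinations

V = frozenset({0, 1, 2})  # hypothetical ground set


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def f(X):
    """Illustrative submodular function: min(|X|, 2)."""
    return min(len(X), 2)


def g(X):
    """Illustrative supermodular function: |X|^2 plus a modular term and a constant."""
    return len(X) ** 2 + 2 - (1 if 0 in X else 0)


def gain(fn, j, X):
    """Marginal gain fn(j | X) = fn(X + j) - fn(X)."""
    return fn(X | {j}) - fn(X)


all_sets = list(subsets(V))

# Step 1: the gap alpha and a minimizer A of g - f.
alpha = min(g(X) - f(X) for X in all_sets)
A = min(all_sets, key=lambda X: g(X) - f(X))


def f_shift(X):
    """Shifted function f'(X) = f(X) + alpha, which satisfies f'(A) = g(A)."""
    return f(X) + alpha


def h(X):
    """Modular function of the form (111), built from f' at the set A."""
    grow = sum(gain(f_shift, j, A) for j in X - A)
    shrink = sum(gain(f_shift, j, A - {j}) for j in A - X)
    return f_shift(A) + grow - shrink


# Step 2: the separation is only claimed for X contained in A or containing A.
comparable = [X for X in all_sets if X <= A or X >= A]
holds = all(f(X) <= h(X) <= g(X) for X in comparable)

print(sorted(A), alpha, holds)
```

In this instance the minimizer is the singleton $\{0\}$ with gap $\alpha = 1$, so Lemma 9.1 does not apply directly, while the construction above still separates $f$ and $g$ on $[\emptyset, A] \cup [A, V]$.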

Observe that Lemma 9.1 is a special case of Lemma 9.2, since when $f(\emptyset) = g(\emptyset)$ (resp. $f(V) = g(V)$) we have 
$\alpha = \min_X [g(X) - f(X)] = 0$ and we may take $A = \emptyset$ (resp. $A = V$). The discrete separation theorems, however, might not hold 
under the most general conditions on $f$ and $g$. They do hold, however, for certain important subclasses. For 
example, if $f$ is an $M^\natural$-concave function (which is submodular when restricted to $2^V$), and $g$ is an $M^\natural$-convex 
function (which is supermodular when restricted to $2^V$), the discrete separation theorem always holds (cf. 
Theorem 8.15 in [41]). 

9.2 Superdifferential Fenchel Duality Theorem 

Finally, we show that a version of the Fenchel duality theorem also holds in certain restricted cases. Given a 
submodular function $f$ (or equivalently a supermodular function $g$), define the concave Fenchel dual functions 
$f_*$, and correspondingly $g^*$, as: 

$$f_*(y) = \min_{X \subseteq V} [y(X) - f(X)], \quad \text{and} \quad g^*(y) = \max_{X \subseteq V} [y(X) - g(X)]. \tag{114}$$

The Fenchel duals $f_*$ and $g^*$ are concave and convex functions respectively. Unlike the convex Fenchel duals, 
obtaining these expressions exactly is NP hard, since they correspond to submodular maximization. These 
can however be approximately obtained up to constant factors. The following lemma gives a restricted version 
of the Fenchel duality theorem: 

Lemma 9.3. Given a submodular function $f$ and a supermodular function $g$ such that the concave discrete 
separation theorem holds, 

$$\max_{X \subseteq V} [f(X) - g(X)] = \min_{x} [g^*(x) - f_*(x)]. \tag{115}$$

Furthermore, if $f$ and $g$ are integral and satisfy the CDST, the minimum on the right hand side is attained 
by an integral vector $x$. 

Proof. The proof of this result follows directly from Theorem 4 of [21]. In particular, [21] shows that, given set 
functions $f$ and $g$ such that the discrete separation theorem holds, the Fenchel duality theorem will also hold 
for this pair of functions $f$ and $g$. □ 
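
For intuition, both sides of the Fenchel duality relation can be evaluated by exhaustive enumeration on a tiny instance. The sketch below is an illustration under assumptions: the functions `f` and `g`, the ground set, the integer grid of candidate vectors, and the helper names are all hypothetical. It checks the weak duality inequality $f(X) - g(X) \le g^*(x) - f_*(x)$ on the grid, and then evaluates the dual objective at the modular separator from Lemma 9.1, where the two sides coincide for this instance. Enumeration is exponential in $|V|$, consistent with the hardness remark above, so this is only feasible for very small ground sets.

```python
from itertools import combinations, product

V = (0, 1, 2)  # hypothetical ground set


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    for size in range(len(ground) + 1):
        for combo in combinations(ground, size):
            yield frozenset(combo)


all_sets = list(subsets(V))


def f(X):
    """Illustrative submodular function: min(|X|, 2)."""
    return min(len(X), 2)


def g(X):
    """Illustrative supermodular function: |X|^2."""
    return len(X) ** 2


def modular(x, X):
    """Evaluate the modular function x(X) = sum of x[j] over j in X."""
    return sum(x[j] for j in X)


def f_lower_star(x):
    """Concave Fenchel dual of f: min over X of x(X) - f(X)."""
    return min(modular(x, X) - f(X) for X in all_sets)


def g_upper_star(x):
    """Convex Fenchel dual of g: max over X of x(X) - g(X)."""
    return max(modular(x, X) - g(X) for X in all_sets)


# Left hand side: max over X of f(X) - g(X).
primal = max(f(X) - g(X) for X in all_sets)

# Weak duality on a small integer grid of vectors x.
weak_duality = all(
    primal <= g_upper_star(dict(zip(V, x))) - f_lower_star(dict(zip(V, x)))
    for x in product(range(-2, 4), repeat=len(V))
)

# Dual objective at the separator x(j) = f(j | empty) used in Lemma 9.1.
x_sep = {j: f(frozenset({j})) - f(frozenset()) for j in V}
dual_at_separator = g_upper_star(x_sep) - f_lower_star(x_sep)

print(primal, dual_at_separator, weak_duality)
```

Here the separator is an integral vector, and it attains the minimum of the dual objective, matching the integrality statement of the lemma for this example.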

Unlike the Fenchel duality theorem from the convex perspective, the result above does not hold in the 
fully general setting. However, if the functions $f$ and $g$ are $M^\natural$-concave and $M^\natural$-convex respectively, the 
Fenchel duality theorem always holds (cf. Theorem 8.21 in [41]). 

9.3 Superdifferential Minkowski Sum Theorem 

Analogous to the results above, we show a certain restricted form of the Minkowski sum theorem. 

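Lemma 9.4. Given two submodular functions $f_1$ and $f_2$, it holds that$^8$: 

$$\mathcal{P}^{f_1 + f_2} = \mathcal{P}^{f_1} + \mathcal{P}^{f_2}. \tag{116}$$

Similarly, $\partial^{f_1 + f_2}(\emptyset) = \partial^{f_1}(\emptyset) + \partial^{f_2}(\emptyset)$ and $\partial^{f_1 + f_2}(V) = \partial^{f_1}(V) + \partial^{f_2}(V)$. 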

Proof. This result follows directly by considering definitions. In particular, one can see that the extreme points 
of these polyhedra can be explicitly characterized by a submodular function. For example, the polyhedron 
$\mathcal{P}^{f_1}$ has a single extreme point defined by the vector $f_1(j), j \in V$. Similarly, the extreme point of $\mathcal{P}^{f_2}$ is 
$f_2(j), j \in V$, and the extreme point of $\mathcal{P}^{f_1 + f_2}$ is $f_1(j) + f_2(j), j \in V$, and hence the Minkowski sum theorem 
holds. □ 

8 Recall, the addition of the polyhedra corresponds to point-wise addition 
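
The argument via a single extreme point can be illustrated numerically. The sketch below assumes that the submodular upper polyhedron of a normalized submodular function $f$ is $\{x : x(X) \ge f(X), \forall X \subseteq V\}$; this definition is restated here as an assumption, since it is not repeated in this section. The functions `f1` and `f2`, the integer grid, and the helper names are hypothetical. For every grid point $z$, the code checks that $z$ lies in the polyhedron of $f_1 + f_2$ exactly when it decomposes as the extreme point of the first polyhedron plus a point of the second.

```python
from itertools import combinations, product

V = (0, 1, 2)  # hypothetical ground set


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    for size in range(len(ground) + 1):
        for combo in combinations(ground, size):
            yield frozenset(combo)


all_sets = list(subsets(V))


def f1(X):
    """Illustrative normalized submodular function: min(|X|, 2)."""
    return min(len(X), 2)


def f2(X):
    """Illustrative normalized submodular function: min(|X|, 1)."""
    return min(len(X), 1)


def f_sum(X):
    """Pointwise sum of f1 and f2."""
    return f1(X) + f2(X)


def in_upper_polyhedron(fn, x):
    """Assumed membership test: x(X) >= fn(X) for every subset X."""
    return all(sum(x[j] for j in X) >= fn(X) for X in all_sets)


def extreme_point(fn):
    """Vector of singleton values (fn(j) for j in V)."""
    return tuple(fn(frozenset({j})) for j in V)


e1, e2, e12 = extreme_point(f1), extreme_point(f2), extreme_point(f_sum)

# z is in the polyhedron of f1 + f2 iff z = e1 + y with y in the polyhedron of f2.
decomposition_matches = True
for z in product(range(0, 5), repeat=len(V)):
    y = tuple(zj - ej for zj, ej in zip(z, e1))
    decomposes = in_upper_polyhedron(f1, e1) and in_upper_polyhedron(f2, y)
    if in_upper_polyhedron(f_sum, z) != decomposes:
        decomposition_matches = False

print(e1, e2, e12, decomposition_matches)
```

The check only covers a finite grid, so it is a sanity test of the extreme point argument on this example and not a proof.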

Unlike the Minkowski sum theorem on the subdifferential and submodular polyhedron, the result above 
may not hold for the superdifferential $\partial^f(X)$ of an arbitrary set $X \subseteq V$, nor must it hold for the generalized 
submodular upper polyhedron $\mathcal{P}^f_{\text{gen}}$. They do hold, however, for certain subclasses of submodular functions, 
such as (once again) the class of $M^\natural$-concave functions. This fact follows from Theorem 3 in [21], and the 
fact that $M^\natural$-concave functions satisfy the Fenchel duality theorem. 

10 Conclusions and Future Work 

In this manuscript, we investigated several connections between convex and concave aspects of submodular 
functions. We provided characterizations of the superdifferentials, concave extensions, and separation and 
duality theorems related to concave aspects of a submodular function, and connected these new results to 
existing results on the convex aspects of submodular functions. To our knowledge, this is the first work 
in this direction. We also showed how, for specific subclasses of submodular functions such as the class of 
$M^\natural$-concave set functions, this characterization is exact, while for other submodular functions it can be 
done approximately. 

We lastly discuss a few problems that remain open and that could be considered for future work. 

• Are there other subclasses of submodular functions (apart from the class of $M^\natural$-concave set functions) 
for which the concave aspects, like the superdifferentials, concave extensions, and characterizations 
like the discrete separation theorem, Fenchel duality theorem, and so on, can be provided exactly? In 
particular, we saw that the $M^\natural$-concave set functions satisfy the property that $\partial^f(X) = \partial^f_{(2,2)}(X)$. 
An interesting question is whether other interesting subclasses of submodular functions exist 
that satisfy similar conditions on their superdifferential (e.g., $\partial^f(X) = \partial^f_{(k,k)}(X)$ for constant $k$). 
Characterizing such functions could lead to exact algorithms, polynomial time in $|V|$ (although exponential in $k$), 
for maximizing such subclasses of submodular functions. 

• In Section 8, we investigated optimality conditions related to submodular maximization and their connection 
to the superdifferential. An interesting open problem is whether this characterization could provide insight 
into algorithms for submodular maximization, and into conditions under which submodular maximization can be 
done exactly. Moreover, it is also interesting that approximating the superdifferential provides different 
approximation algorithms for submodular maximization. It would be interesting to know whether there is a principled 
relationship between these two. 

• In Section 9, we studied the Fenchel duality theorem, the discrete separation theorem, and the Minkowski 
sum theorem. We showed that these results hold under restricted settings. An open question is whether Edmonds' 
intersection theorem (cf. Section 4.1 in [20]) also holds under certain restricted settings. 

• In Section 6.3.1 we defined inner bounds on the generalized submodular upper polyhedron and mentioned 
how it would also be easy to define outer bounds on this polyhedron. Then in Section 7.1 we used these 
inner bounds to produce upper bounds on the concave extension of a submodular function. An open 
question is if the aforesaid outer bounds, which would produce corresponding tractable lower bounds 
on the concave extensions, would be useful for optimization or certain applications. 

• Figure 7 geometrically suggests that a rotation of the axes could transform the set of superdifferentials 
into a set of subdifferentials. It would be interesting to see whether, in the general case of a ground set $V$ of 
arbitrary size, some multi-axis rotation might perform a similar transformation, and to understand the complexity of 
identifying this rotation (something that must be at least NP-hard due to the hardness of submodular 
maximization). 

• It may be interesting to consider the Lasserre hierarchy [48], and how it might be related to the 
complexity of the inequalities needed to characterize the superdifferential and bound the complexity of 
computing submodular maximization for certain submodular functions. 

• Additional analytical expressions of set functions are $M^\natural$-concave when the family of sets is restricted 
to a laminar family of subsets of $2^V$ [42]. It may be elucidating to consider how the superdifferential, 
and in particular $\partial^f_{(2,2)}(X)$, relates to laminar families. 

• Finally, thanks to the Minkowski sum theorem, the Lovász extension of a submodular function satisfies 
$\breve{f}_{1+2}(x) = \breve{f}_1(x) + \breve{f}_2(x)$, where $f_{1+2}(X) = f_1(X) + f_2(X)$, i.e., the Lovász extension of a sum of 
two submodular functions is equal to the sum of the individual Lovász extensions. An open problem 
is whether this relation holds (possibly under restricted settings) for the various concave extensions 
defined in this manuscript. 
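
The additivity of the Lovász extension mentioned in the last item can be observed directly from its standard greedy (sorting) formula. The sketch below is only an illustration of that known relation, not of the open problem: the functions `f1` and `f2`, the test vector, and the helper names are hypothetical, and the formula is written for normalized set functions.

```python
V = (0, 1, 2)  # hypothetical ground set


def f1(X):
    """Illustrative normalized submodular function: min(|X|, 2)."""
    return min(len(X), 2)


def f2(X):
    """Illustrative normalized submodular function: min(|X|, 1)."""
    return min(len(X), 1)


def f_sum(X):
    """Pointwise sum of f1 and f2."""
    return f1(X) + f2(X)


def lovasz_extension(fn, x):
    """Evaluate the Lovasz extension of a normalized set function at x.

    `x` maps each element of V to a number. Elements are visited in
    decreasing order of x, and each contributes x[j] times its marginal
    gain with respect to the elements visited before it.
    """
    order = sorted(x, key=lambda j: -x[j])
    value = 0
    prefix = frozenset()
    for j in order:
        extended = prefix | {j}
        value += x[j] * (fn(extended) - fn(prefix))
        prefix = extended
    return value


x = {0: 3, 1: -1, 2: 2}  # arbitrary test point

lhs = lovasz_extension(f_sum, x)
rhs = lovasz_extension(f1, x) + lovasz_extension(f2, x)

# On the indicator vector of a set, the extension agrees with the set function.
indicator = {0: 1, 1: 0, 2: 1}
agrees_on_set = lovasz_extension(f1, indicator) == f1(frozenset({0, 2}))

print(lhs, rhs, agrees_on_set)
```

The equality holds here because the greedy formula is linear in the set function for a fixed ordering of $x$. Whether an analogous identity holds for the concave extensions is exactly the question left open above.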